1. Introduction
Vehicle exhaust pollution is an important threat to public health [1]. Dynamic estimates of how mobile pollution sources change over time on the road network are fundamental data for air pollution control and public health promotion [2]. Air quality management requires pollution sources to be controlled in coordination [3]. Fixed pollution sources such as steel plants [4] and power plants [5], as well as meteorological parameters [6], have been monitored with high spatiotemporal resolution. However, the emissions of mobile vehicle sources are still measured only as annual totals because of the limitations of emission detection technology [7]. This coarse temporal and spatial granularity makes it difficult to evaluate vehicle emissions consistently with other pollution sources.
Traffic volume covering every link of the entire road network is one of the important parameters for dynamic emission estimation, but it is difficult to collect with existing low-coverage traffic detection and has not been estimated effectively [8]. Short-term traffic flow prediction based on machine learning, which depends on large amounts of historical traffic flow data, has been well studied [9]. Historical flow data are sufficient for high-grade roads but extremely scarce for low-grade roads [10]. Moreover, traffic flow characteristics learned from historical regularities struggle to reflect the real-time traffic state on the road network, especially when incidental traffic events occur.
Although whole sample traffic volumes of the entire road network are difficult to collect, the technology for detecting speed data on large-scale road networks is well developed [11,12]. The traffic fundamental diagram model describes the relationship between speed and volume [13], so it offers an effective way to estimate volume from speed data. The accuracy of volume estimation based on the traffic fundamental diagram improves as the time granularity becomes coarser: it is relatively low at the minute level and relatively high at the hour level [14]. For this reason, the method is not widely used for short-term traffic flow prediction. However, hourly volume meets the demand for emission data. In this paper, the traffic fundamental diagram is used to estimate dynamic real-time flow from timely speed data.
The influencing factors of the macroscopic fundamental diagram (MFD) are analyzed quantitatively, including road types, weather conditions, traffic patterns and travel periods. A novel method for estimating whole sample traffic flows and emissions based on a multifactor MFD is proposed. The results can be used to prepare a dynamic vehicle emission inventory for the road network, which helps transportation and environmental departments formulate energy conservation and emission reduction policies and thereby supports public health.
2. Literature Review
The quantitative estimation of dynamic emissions is fundamental for pollution control. Mobile sources have become an important part of air pollution, accounting for 55.6% of the total nitrogen oxide emissions in China in 2020 [15]. The Ministry of Ecology and Environment of China plans to evaluate emission sources with a uniform time granularity. Industrial, agricultural and domestic sources are fixed sources, for which continuous dynamic data are easy to obtain because there are fewer pollution sources and the monitoring technology is simple [4,5,6]. Mobile sources involve complex detection technologies and high testing costs, so their evaluation remains at the level of static annual totals [16]. The evaluation granularity of vehicle emissions therefore cannot be consistent with that of other pollution sources in time and space, which limits the real-time detection and overall evaluation of air pollution. As a result, dynamic estimation of vehicle emissions has long been a bottleneck in dynamic air quality simulation.
One of the important reasons why it is difficult to estimate dynamic vehicle emissions is the lack of an effective method to estimate whole sample dynamic traffic volumes of urban road networks. Vehicle emissions can be calculated based on single vehicle emission factors and vehicle kilometers traveled (VKT) [14]. The single vehicle emission factor is the emissions per kilometer of a single vehicle, which has a good research foundation and can be effectively obtained [17]. The whole sample dynamic traffic volume is the traffic volume that dynamically changes over time on every road link of a road network, which is an important parameter to calculate the dynamic VKT [16]. The main sources of traffic volume in existing studies include field traffic surveys, traffic simulations and traffic flow models. Field traffic surveys include fixed detector detection and manual surveys, which are costly and difficult. The road network coverage of field traffic survey data is limited, and it is difficult to obtain large-scale real-time dynamic data [18]. The traffic simulation method [19] has difficulty ensuring the consistency of the simulation environment and the actual traffic conditions. The traffic flow model method uses speed data to estimate volume based on traffic fundamental diagrams [20]. Dynamic speed data of large-scale road networks can be easily obtained based on massive data, such as floating car data (FCD) [11] and travel trajectories collected from mobile applications [21], which have high temporal and spatial continuity. Dynamic traffic control, such as signal timing optimization, requires small time granularity (five minutes) and high estimation accuracy of traffic volume [22]. The estimation accuracy of small time granularity traffic volume calculated using a fundamental diagram model cannot meet the requirement of the traffic control field; therefore, a fundamental diagram is not widely used in the traffic control field [16]. The hourly time granularity volume meets the emission estimation requirement and can be estimated with high accuracy based on a traffic flow fundamental diagram [14]. This paper proposes a method to calculate whole sample dynamic volumes of urban road networks based on traffic flow fundamental diagrams.
The traffic fundamental diagram was originally proposed by Greenshields in the form of a linear model in the 1930s [23]. Then, the logarithmic, exponential and vehicle-following models were separately introduced by Greenberg [24], Underwood [25] and Pipes [26]. In 1995, the Van Aerde model was proposed [27] with four parameters, a single structure, continuous feature and flexible calibration. It was found that the Van Aerde model fits the traffic flow of highways and expressways well [28]. The Underwood model has good performance in describing major arterial road and minor arterial road traffic flows [20]. However, the data used to develop these models are mostly collected from a single observation point. With the development of urban road networks and increasing traffic flow complexity, traffic fundamental diagrams have gradually expanded from single-section to multi-section networks. The concept of a macroscopic fundamental diagram (MFD) was proposed [29], and the existence of MFDs was verified based on field survey data and simulation data [30,31]. Some scholars [32,33] proposed new fundamental traffic diagrams based on an MFD. Existing research on MFDs is mainly based on simple structure road networks, excellent traffic conditions or simulation methods. MFD models based on real traffic conditions and complex road networks need to be further studied.
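As a minimal sketch of how a fundamental diagram turns a measured speed into a volume, the code below inverts the textbook forms of the linear Greenshields model and the exponential Underwood model. The parameter values (free-flow speed, jam density, density at capacity) are hypothetical and chosen only for illustration; they are not calibrated values from this paper.

```python
import math


def greenshields_flow(speed, free_flow_speed, jam_density):
    """Flow (veh/h/lane) implied by the linear Greenshields model at a given speed."""
    # v = vf * (1 - k / kj)  =>  k = kj * (1 - v / vf), and q = k * v
    density = jam_density * (1.0 - speed / free_flow_speed)
    return density * speed


def underwood_flow(speed, free_flow_speed, optimal_density):
    """Flow (veh/h/lane) implied by the exponential Underwood model at a given speed."""
    # v = vf * exp(-k / k0)  =>  k = k0 * ln(vf / v), and q = k * v
    density = optimal_density * math.log(free_flow_speed / speed)
    return density * speed


# Hypothetical parameters (assumptions for illustration only).
VF = 80.0   # free-flow speed, km/h
KJ = 120.0  # jam density, veh/km/lane
K0 = 40.0   # density at capacity, veh/km/lane

for v in (20.0, 40.0, 60.0):
    q_linear = greenshields_flow(v, VF, KJ)
    q_exponential = underwood_flow(v, VF, K0)
    print(f"speed={v:5.1f} km/h  Greenshields={q_linear:7.1f}  Underwood={q_exponential:7.1f}")
```

In both models each speed between zero and the free-flow speed maps to exactly one density, which is what makes speed a usable input for volume estimation; the reverse mapping from volume to speed is two-valued.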
Traffic flows are divided into patterns to improve the effectiveness and accuracy of traffic volume prediction. The travel behaviors of residents are cyclical and repetitive [18], which leads to multiple but limited traffic flow patterns. Existing studies have constructed representative traffic flow patterns from traffic volume and speed data as follows. Jia et al. [34] proposed a spatiotemporal neural network model to predict traffic flow for each road segment, in which the traffic flow was divided into recent, daily and weekly parts. Huang et al. [35] proposed a multimode traffic flow prediction method with clustering-based attention convolution LSTM, which models the spatial-temporal data of traffic flow by integrating weather, wind speed, holidays and other factors to improve the prediction accuracy. Ma et al. [36] proposed a dynamic time warping method to select appropriate historical data for daily traffic flow forecasting based on factors that influence daily traffic flow patterns, such as season, day of the week, weather and holidays. Dividing traffic flow patterns thus improves the effectiveness and accuracy of traffic volume prediction. However, most studies use field survey traffic flows on highways or expressways as clustering indexes and do not take into account the influence of geographic location, road type and land use function on the division of traffic flow patterns, which makes it difficult to discover the differences among traffic flow change patterns. In this paper, a traffic flow clustering method considering various factors is proposed based on traffic flow data from various road types.
In summary, traffic flows can be clustered into a limited number of typical patterns because of the regularity of travel and the grading characteristics of roads. Predicting flows separately for each pattern is an effective way to improve the prediction accuracy. Although whole sample dynamic traffic volumes of urban road networks are difficult to obtain, large-scale speed data of a road network can be obtained from a floating car system. In this paper, the traffic flow model is used to predict traffic volume from speed data for different patterns.
3. Methodology
3.1. Research Steps
The general methodology of this study is divided into the following steps, as shown in Figure 1.
First, the data preparation of traffic speed and volume for developing traffic flow patterns and traffic fundamental diagram models is introduced. The speed and volume data are aggregated into a 1-h granularity. Then, the time and location information of different data sources are matched.
Second, a clustering method of traffic flow patterns based on multidimensional features is constructed. The optimal traffic flow pattern clustering method is designed using different clustering methods. A library of typical traffic flow patterns under various traffic conditions is constructed.
Third, a traffic flow pattern recognition method based on speed indexes in different periods is constructed. Indexes suitable for different traffic application scenarios are designed to recognize traffic flow patterns rapidly.
Finally, a method is proposed to estimate whole sample dynamic traffic volumes of urban road networks based on clustering and recognition of traffic flow patterns. Multifactor MFD models are constructed based on road types, traffic flow patterns and morning and evening peak hours on weekdays. The dynamic traffic volume is estimated based on real-time speed data. The dynamic emissions of a road network before and after the traffic restriction policy are calculated as a case study.
3.2. Data Source and Preparation
The data used in this study include three parts.
(1) The average spatial speed data are from FCD in 5-min intervals of various road types in Beijing. The number of days for highway, expressway, major arterial road and minor arterial road data were 164, 175, 265 and 213, respectively, in 2018. Data on weekdays, weekends and holidays under different weather conditions are included. The attributes of average spatial speed data include link ID, road name, date, time, road type and average spatial speed.
(2) The traffic volume data from remote traffic microwave sensors (RTMSs) in 2-min intervals of various road types in Beijing are used. The number of days for highway, expressway, major arterial road and minor arterial road data were 132, 144, 205 and 153, respectively, in 2018. The attributes of the RTMS traffic volume data include detector number, road name, date, time and traffic volume.
(3) The traffic volume from traffic survey data at 5-min intervals under various road types in Beijing is used. The RTMS data on highways and expressways are relatively abundant, whereas the data on major arterial roads and minor arterial roads are relatively lacking. The traffic volume survey data are used to supplement the RTMS volume data. The attributes of traffic survey data include survey date, road name, vehicle type and traffic volume.
After data quality control, the speed data are integrated according to Equation (1).
$$\bar{v}_j = \frac{1}{n}\sum_{i=1}^{n} v_i \qquad (1)$$
where $\bar{v}_j$ is the speed of hour $j$ in 1-h intervals, $v_i$ is the original (60/n) min speed in each 1-h interval and $n$ is the number of the original (60/n) min data in each 1-h interval. The traffic volume data are integrated according to Equation (2).
$$Q_j = \sum_{i=1}^{n} q_i \qquad (2)$$
where $Q_j$ is the traffic volume of hour $j$ in 1-h intervals and $q_i$ is the original (60/n) min traffic volume in each 1-h interval.
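The aggregation to 1-h granularity can be sketched as follows. The sketch assumes that sub-hourly speeds are averaged and that sub-hourly volumes are interval vehicle counts that are summed; the function name `aggregate_hourly` and all sample values are hypothetical.

```python
from collections import defaultdict


def aggregate_hourly(records, how):
    """Aggregate (hour, value) records into one value per hour.

    how="mean" averages the sub-hourly values (used here for speed);
    how="sum" adds them up (used here for interval vehicle counts).
    """
    buckets = defaultdict(list)
    for hour, value in records:
        buckets[hour].append(value)
    if how == "mean":
        return {h: sum(vals) / len(vals) for h, vals in sorted(buckets.items())}
    if how == "sum":
        return {h: sum(vals) for h, vals in sorted(buckets.items())}
    raise ValueError("how must be 'mean' or 'sum'")


# Hypothetical 5-min speeds (km/h): 12 records per hour.
speeds = [(7, 42.0 + i) for i in range(12)] + [(8, 30.0)] * 12
# Hypothetical 2-min vehicle counts: 30 records per hour.
counts = [(7, 50)] * 30 + [(8, 60)] * 30

hourly_speed = aggregate_hourly(speeds, "mean")
hourly_volume = aggregate_hourly(counts, "sum")
print(hourly_speed)
print(hourly_volume)
```

If the detector reports flow rates rather than counts, averaging would be the appropriate choice for volume as well.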
Then, the speed and traffic volume data are normalized using Equation (3) to remove the effect of differences in data magnitude.
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \qquad (3)$$
where $x$ is the speed (km/h) or traffic volume (pcu/h/lane), $x'$ is the normalized speed or traffic volume and $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of all records in a day, respectively.
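A minimal sketch of this daily normalization is given below, assuming standard min-max scaling over the records of one day. The handling of a constant profile (returning zeros) is an assumption added to avoid division by zero and is not taken from the paper.

```python
def min_max_normalize(values):
    """Scale one day's records to [0, 1] using that day's minimum and maximum."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # Assumption: a flat daily profile is mapped to zeros.
        return [0.0 for _ in values]
    return [(x - lowest) / (highest - lowest) for x in values]


# Hypothetical hourly speeds (km/h) for part of a day.
print(min_max_normalize([20.0, 40.0, 60.0]))
```

Because the scaling uses each day's own extremes, two days with the same shape but different speed levels produce the same normalized profile.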
5. Traffic Flow Pattern Recognition Method
5.1. Traffic Flow Pattern Recognition Algorithms
Pattern recognition algorithms based on intelligent algorithms perform well in the traffic flow field. A back propagation (BP) neural network is widely used because of the simplicity and flexibility of its structure. Optimization algorithms, such as a genetic algorithm (GA) and a simulated annealing genetic algorithm (SAGA), are used to mitigate the slow convergence of a BP neural network and its tendency to fall into local extrema. A deep belief network (DBN) extracts high-dimensional abstract features of traffic flows through a multilayer hidden structure and recognizes patterns in large-scale data efficiently. Therefore, the BP, GA-BP, SAGA-BP and DBN algorithms are used to recognize the traffic flow patterns [41,42,43,44]. The flow chart of each algorithm is shown in Figure 7. The parameters of the algorithms are shown in Table 3.
5.2. Traffic Flow Pattern Recognition Indexes
A variety of traffic flow pattern recognition indexes are designed according to traffic flow characteristics to adapt to different collection conditions of traffic data and traffic application scenarios.
(1) Index I: Speeds from 0:00 to 24:00
Speeds from 0:00 to 24:00 contain the complete traffic flow characteristics over time, which are suitable for pattern recognition of historical traffic flow data, such as the evaluation of the effects of traffic policies.
(2) Index II: Speeds from 0:00 to 6:00
Speeds from 0:00 to 6:00 describe the characteristics of the early morning traffic flow, which are suitable for real-time traffic scenarios.
(3) Index III: Speeds from 0:00 to 12:00
Speeds from 0:00 to 12:00 describe the traffic flow characteristics in the early morning and morning peak hours, which have a degree of real-time performance.
(4) Index IV: Speeds from 6:00 to 10:00
Speeds from 6:00 to 10:00 describe the traffic flow characteristics of morning peak hours, which have a degree of real-time performance. Index IV has fewer dimensions, which is useful to improve the efficiency of pattern recognition.
(5) Index V: Speeds from 6:00 to 10:00 and 17:00 to 21:00
Speeds from 6:00 to 10:00 and 17:00 to 21:00 describe the traffic flow characteristics of morning and evening peak hours, which have a better ability to reflect the difference in traffic flow characteristics of different categories.
The pattern recognition results of different algorithms based on each index are analyzed on different road types. Figure 8 shows the results for an expressway as an example.
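The five indexes can be read as different slices of a daily speed profile. The sketch below assumes 1-h speeds stored as a list of 24 values, where element `h` covers the interval from `h`:00 to `h`+1:00; the names `INDEX_HOURS` and `index_features` are hypothetical.

```python
# Hours included in each recognition index (end hour exclusive).
INDEX_HOURS = {
    "I": list(range(0, 24)),                         # 0:00 to 24:00
    "II": list(range(0, 6)),                         # 0:00 to 6:00
    "III": list(range(0, 12)),                       # 0:00 to 12:00
    "IV": list(range(6, 10)),                        # 6:00 to 10:00
    "V": list(range(6, 10)) + list(range(17, 21)),   # 6:00 to 10:00 and 17:00 to 21:00
}


def index_features(hourly_speeds, index_name):
    """Return the feature vector of one day for the chosen recognition index."""
    if len(hourly_speeds) != 24:
        raise ValueError("expected 24 hourly speed values")
    return [hourly_speeds[h] for h in INDEX_HOURS[index_name]]


# Hypothetical daily profile: the speed value equals the hour number.
profile = [float(h) for h in range(24)]
for name in INDEX_HOURS:
    print(name, len(index_features(profile, name)))
```

Under this hourly assumption the feature vectors have 24, 6, 12, 4 and 8 dimensions, so Index IV is the smallest input and Index V doubles it by adding the evening peak.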
5.3. Pattern Recognition Evaluation Indexes
The evaluation indexes of the pattern recognition effect include the recognition rate based on a confusion matrix (CM) and the pattern recognition simulation time. The recognition rates and simulation times of different algorithms based on each index are shown in Table 4.
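As an illustration of a recognition rate derived from a confusion matrix, the sketch below assumes that the rate is the overall share of correctly recognized samples (the diagonal of the matrix divided by the total). The exact definition used for Table 4 is not given here, and the labels are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, n_classes):
    """Count samples by (true pattern, recognized pattern)."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(true_labels, predicted_labels):
        matrix[true_label][predicted_label] += 1
    return matrix


def recognition_rate(matrix):
    """Share of samples on the diagonal, i.e. recognized as their true pattern."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


# Hypothetical labels for eight days and three traffic flow patterns.
true_patterns = [0, 0, 1, 1, 2, 2, 2, 1]
recognized = [0, 1, 1, 1, 2, 2, 0, 1]

cm = confusion_matrix(true_patterns, recognized, n_classes=3)
rate = recognition_rate(cm)
print(cm)
print(f"{rate:.2%}")
```

The off-diagonal cells show which patterns are confused with each other, which the single rate does not reveal.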
The results of pattern recognition of the different algorithms based on each index are analyzed on different road types, and the conclusions are as follows.
(1) The accuracy and efficiency of pattern recognition of the DBN are the highest under various conditions. The GA-BP and SAGA-BP have better performance than the BP algorithm.
(2) The index of speeds from 0:00 to 24:00 has the best recognition effect, and the average correct recognition rate of all road types is 94.87%. The index of speeds from 6:00 to 10:00 and 17:00 to 21:00 also has a good recognition effect, and the average correct recognition rate of all road types is 93.26%.
(3) The average recognition rates of the index of speeds from 0:00–12:00 and the index of speeds from 6:00–10:00 exceed 80%, and the highest average recognition rate is 89.39%.
(4) The index of speeds from 0:00 to 6:00 has a low recognition rate, and the correct recognition rate of all road types is 59.36%. The early-morning data differ relatively little among categories, so they are not suitable as traffic flow pattern recognition indexes.
7. Case Study
The effect of the traffic restriction on diesel trucks of the national stage III emission standard is evaluated as a case study. The changes in dynamic vehicle emissions before and after the implementation of the traffic restriction policy are calculated.
The traffic restriction policy for diesel trucks of the national stage III emission standard was published on 1 December 2018. Diesel trucks of the national stage III emission standard are forbidden to drive within the Fifth Ring Road (exclusive) from 6:00 to 23:00 every day, and trucks of 8 tons or more are forbidden to drive on the Fifth Ring Road. Data from three weekdays before the implementation of the policy (5, 6 and 7 November 2018) and three weekdays after it (3, 4 and 5 December 2018) are used to calculate the emissions within the Fifth Ring Road.
The calculation method of the road link emissions is shown in Equation (12):
$$E_i = \sum_{j} EF_{j}^{v,R} \times Q_i \times L_i \times P_{j}^{R} \times 10^{-6} \qquad (12)$$
where $E_i$ is the emission of road link $i$ (ton); $EF_{j}^{v,R}$ is the emission factor (g/km) of vehicle type $j$ when its speed is $v$ on road type $R$; $Q_i$ is the traffic volume of road link $i$ (pcu/h); $L_i$ is the length of road link $i$ (km); $P_{j}^{R}$ is the proportion of vehicle type $j$ on road type $R$; and the factor $10^{-6}$ converts grams to tons. The vehicle emissions of the road network are the sum of the emissions of all road links.
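The link-level calculation can be sketched as follows, assuming that the link emission is the sum over vehicle types of emission factor × volume × link length × vehicle-type proportion, with grams converted to tons. The emission factors, fleet shares and function names are hypothetical, and the example factor ignores speed for simplicity.

```python
def link_emission_tons(volume_pcu_h, length_km, speed_kmh, road_type,
                       emission_factor, fleet_share):
    """Hourly emission of one road link in tons.

    emission_factor(vehicle_type, speed_kmh, road_type) returns g/km;
    fleet_share[road_type] maps vehicle type to its proportion on that road type.
    """
    grams = 0.0
    for vehicle_type, share in fleet_share[road_type].items():
        factor = emission_factor(vehicle_type, speed_kmh, road_type)
        grams += factor * volume_pcu_h * length_km * share
    return grams / 1e6  # grams to tons


def example_emission_factor(vehicle_type, speed_kmh, road_type):
    """Hypothetical NOx factors in g/km; constant in speed for illustration only."""
    return {"car": 0.5, "truck": 5.0}[vehicle_type]


# Hypothetical fleet composition on expressways.
shares = {"expressway": {"car": 0.9, "truck": 0.1}}

link_total = link_emission_tons(1200.0, 2.5, 45.0, "expressway",
                                example_emission_factor, shares)
print(link_total)
```

Summing `link_emission_tons` over all links and hours of a period would give the network total for that period under the same assumptions.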
The emissions of each pollutant on the road network before and after the implementation of the traffic restriction policy are calculated for different periods, such as 0:00 to 24:00, 6:00 to 23:00, 7:00 to 9:00, 17:00 to 19:00 and 20:00 to 22:00, as shown in Table 7.
The emissions of each pollutant on the whole road network within the Fifth Ring Road all decrease from 6:00 to 23:00 after the implementation of the traffic restriction policy. The CO2 emissions decrease from 36,817.12 tons to 34,144.84 tons, a decrease of 7.26%. The NOx emissions decrease from 135.96 tons to 118.05 tons, a decrease of 13.18%. The CO emissions decrease from 522.25 tons to 495.58 tons, a decrease of 5.11%. The PM emissions decrease from 1.95 tons to 1.57 tons, a decrease of 19.78%. The HC emissions decrease from 96.56 tons to 91.56 tons, a decrease of 5.18%.
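The decrease rates follow from the before and after totals as (before − after) / before. The short sketch below recomputes them for CO2, CO and HC from the tonnages quoted above; the function name `decrease_rate` is hypothetical.

```python
def decrease_rate(before, after):
    """Relative decrease in percent."""
    return (before - after) / before * 100.0


# Emissions from 6:00 to 23:00 in tons (before, after), as quoted in the text.
emissions = {
    "CO2": (36817.12, 34144.84),
    "CO": (522.25, 495.58),
    "HC": (96.56, 91.56),
}

rates = {name: round(decrease_rate(b, a), 2) for name, (b, a) in emissions.items()}
print(rates)
```

Because the quoted tonnages are rounded to two decimals, rates recomputed this way can differ slightly from the reported ones for pollutants with small totals, such as PM.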
The emission intensity of each pollutant on the road network before and after the implementation of the traffic restriction policy is calculated for different periods, such as 0:00 to 24:00, 6:00 to 23:00, 7:00 to 9:00, 17:00 to 19:00 and 20:00 to 22:00, as shown in Table 8.
The emission intensity of each pollutant from 0:00 to 24:00, 7:00 to 9:00, 17:00 to 19:00 and 20:00 to 22:00 before and after the implementation of the traffic restriction policy is shown on the GIS map in Figure 14.
The emission intensities of CO2, NOx, CO, HC and PM in the road network all decrease after the traffic restriction. The decreases in the emission intensities of CO2, NOx, CO, HC and PM, respectively, in each period are as follows.
(1) From 0:00 to 24:00: 5.50%, 11.48%, 3.28%, 3.33% and 19.05%.
(2) From 6:00 to 23:00: 6.03%, 13.08%, 3.48%, 3.55% and 21.44%.
(3) From 2:00 to 4:00: 27.91%, 59.52%, 14.03%, 15.68% and 73.76%.
(4) From 7:00 to 9:00: 3.12%, 7.84%, 1.20%, 1.43% and 14.89%.
(5) From 10:00 to 12:00: 5.53%, 10.66%, 3.26%, 3.31% and 18.16%.
(6) From 17:00 to 19:00: 6.52%, 11.37%, 4.57%, 4.43% and 18.27%.
(7) From 20:00 to 22:00: 6.53%, 14.46%, 3.89%, 3.93% and 22.50%.
8. Conclusions
The main research results and conclusions of this paper are as follows:
(1) The SOM clustering method based on the direct indexes of time-varying speed differences effectively clusters urban traffic flow. Traffic flow patterns describe more than 90% of the traffic flows on urban road networks, such as Monday to Thursday, Friday, Saturday, Sunday, rainy days, holidays, prominent evening peak and prominent morning peak patterns.
(2) The recognition accuracy and efficiency of the DBN are the highest under various conditions. The GA-BP and SAGA-BP have better performance than the BP algorithm. The average correct recognition rates across road types for the indexes of speeds from 0:00 to 24:00, from 0:00 to 12:00, from 6:00 to 10:00, and from 6:00 to 10:00 together with 17:00 to 21:00 are 94.87%, 82.45%, 80.96% and 93.26%, respectively. The early-morning data in each category are not suitable for traffic flow pattern recognition.
(3) In addition to the large-vehicle proportion and weather conditions identified by existing studies, it was found that differences in driving behavior caused by different travel purposes, such as those between morning and evening peak hours and between weekdays and holidays, significantly affect the fundamental diagrams. The traffic capacity of the same road link in the morning peak hours is 3.47% higher than that in the evening peak hours. The traffic capacity of the same road link on holidays is 4.73% higher than that on weekdays. MFD models considering the road type, morning and evening peak hours, weekdays and holidays improve the flow measurement accuracy by 6.51% compared with models considering only road type.
(4) The case study shows that based on the traffic flow pattern clustering and the recognition method proposed in this paper, the traffic flow pattern is identified quickly, and the whole sample traffic flow on the urban road network is calculated quickly by speed data. The dynamic emissions on the road network are further acquired.