1. Introduction
With the rapid development of railway-related technologies and the maturing of the underlying scientific theories, railway safety risks have attracted increasing research attention, and railway operation safety has become one of the key directions in railway research. Many analysis methods have been applied to safety risk assessment in different domains, including automotive, machinery, industrial, electrical, and railway vehicle applications, with the aim of giving railway operation management departments scientific methods and decision support for avoiding or reducing high-speed railway safety accidents in practice.
Yankun Zhang [1] divided and analyzed the factors of high-speed railway vehicle safety into four aspects: human, machine, environment, and management. Following the principle of establishing evaluation indicators, these main factors were selected as evaluation indicators, and an AHP-fuzzy comprehensive evaluation model was established. Many examples were provided for the purpose of verification and application, and safety evaluation results were obtained. Similarly, Chengrui Li et al. [2] analyzed the safety-influencing factors of high-speed railway passenger stations. They established a three-layer safety evaluation index system, constructed a safety evaluation model using a fuzzy comprehensive evaluation method, and conducted a safety evaluation on the example of a high-speed railway passenger station in Northeast China. Honglei Yao [3] used the analytic hierarchy process to scientifically design a risk assessment index system for railways’ key information infrastructure. Using weight data as the input data for the backpropagation neural network, the risk assessment results were predicted and the risk level was determined. Yaoru Zhang [4] designed a risk assessment index system for the cooperation of work and electricity with skylight operations and used the analytic hierarchy process to calculate the combined weights of various safety risk factors. The impact of each risk factor on the safety of work and electricity with skylight operations was determined, and corresponding measures were proposed concerning four aspects: personnel, management, equipment, and environment. Xiangyu Sun [5] used relevant theories and modeling methods, such as the human factors classification and analysis system, logit model, and interpretive structural model, to identify and analyze the main human factors of five railway engineering liability accidents; they also analyzed the hierarchical relationship of the causal factors for the main problems. Kurz S.L. and Milius B. [6] focused on the risks that could not be ignored in risk assessment by conducting a risk assessment of European railways. Reinach S. and Viale A. [7] applied the Human Factors Analysis and Classification System (HFACS) accident analysis method to railway accident research and conducted a comprehensive study of human factors in accidents using data on British railway accidents, thus expanding the application of the HFACS to different accident fields. Zhan Q. et al. [8] developed an improved method for analyzing human factors in accidents based on the structural system of the HFACS accident analysis method, aiming to refine and explore the human and organizational factors in accidents. Lower M. et al. [9] combined the HFACS accident analysis method with the STAMP error classification method and then, based on the structure of the human factors model, separately controlled the causal factors at each level to identify potential hazards and proposed corresponding solutions.
The railway engineering department is an important part of the railway system because it is responsible for the maintenance and repair of railway lines and related equipment. Its activities mainly comprise major, medium, and routine maintenance of bridges, tunnels, culverts, roadbeds, rails, turnouts, sleepers, and ballast, most of which are mechanical operations. With the development of technology, new technologies and equipment have been steadily introduced into railway engineering operations and management. However, owing to constraining factors such as uneven worker quality, limited management level, and sudden and severe natural disasters, a large number of railway traffic accidents still occur every year. According to the China Railway Yearbook (2021) [10], there were 138 responsible train accidents, including occupational injuries, in various engineering departments of the China Railway Group in 2020, including 3 major accidents, 8 general Class B accidents, 13 general Class C accidents, and 114 general Class D accidents. The total number of responsible accidents in 2020 increased by 21 compared to 2019.
Accidents occur repeatedly, and any negligence in the management and operation process can easily have severe consequences. Therefore, it is crucial to analyze the accident data, explore the internal mechanisms, potential laws, leading factors, and transmission mechanisms of accidents, and apply these experiences and laws to actual operation management, to improve the overall safety level of the railway operation management system. At the same time, considering the key defects, common deficiencies, frequent emergencies, equipment prone to malfunctions, and operational processes prone to problems, corresponding accident prevention measures should be taken, and efficient response strategies and emergency management measures should be implemented to reduce the occurrence of accidents and risk events. Thus, research in this area is of great significance for improving the overall level of safety management, reducing the number of accidents, ensuring personal and property safety, enhancing passenger satisfaction, and improving the efficiency of train operation organization [11].
Many existing studies have adopted a single method to analyze safety risks in a single scenario using a single indicator, without considering the current situation of various risk-related factors in a railway system, optimization under multi-objective conditions, and coordination problems with multiple different dimensions of demand [12,13,14,15,16,17,18,19,20,21]. There has been relatively little research on optimizing the indicator combinations, and some of the existing methods involve subjective evaluations, leaving them unable to determine the logical relationship between indicators and outputs.
To address the aforementioned shortcomings, this paper improves and optimizes the index screening analysis method, and it considers risk correlation factors and optimization under multi-objective conditions. First, the multi-objective particle swarm optimization algorithm is used to determine key risk factors, bringing the indicator screening more in line with the requirements of practical applications. Then, the Bayesian networks are selected, and their structure is optimized to analyze the propagation diagnosis and probability of key risk indicators and obtain the causal logic chain that produces accident results. Finally, the combination of the selected indicators and Bayesian networks is used to obtain more accurate results than those provided by the existing research and fill the gap in research on railway safety risks in risk transmission mechanisms.
2. Model Design
This section introduces the methods used in model construction: the multi-objective optimization model, Pareto optimal solutions, the multi-objective particle swarm optimization algorithm, and Bayesian networks.
2.1. Index Screening Multi-Objective Optimization Model and Pareto Optimal Solution
The principle of selecting key indicators is to remove redundant indicators while ensuring that the accuracy of railway engineering safety status recognition meets the requirements. In this study, the combination of indicators is defined as the independent variable, and $x = (x_1, x_2, \ldots, x_n)$ with $x_k \in \{0, 1\}$ represents the binary code of the indicator combination, where $x_k = 1$ means that the $k$th indicator is selected. This principle can be expressed in the form of multi-objective programming as [22]

$$\min_{x} \sum_{k=1}^{n} x_k, \qquad \max_{x} \mathrm{Acc}(x). \quad (1)$$

Further, Equation (1) can be expressed in the standard form as follows:

$$\min_{x} F(x) = \left( f_1(x), f_2(x) \right), \qquad f_1(x) = \sum_{k=1}^{n} x_k, \qquad f_2(x) = 1 - \mathrm{Acc}(x) = \mathrm{Err}(x). \quad (2)$$

In Equations (1) and (2), $\mathrm{Acc}(x)$ represents the accuracy of a machine-learning-based state recognition (classification) algorithm, and $f_1(x)$ and $f_2(x)$ denote the standard forms of $\sum_{k} x_k$ and $\mathrm{Acc}(x)$, respectively, where $\mathrm{Err}(x)$ is the state recognition error. For machine-learning-based algorithms, the sub-objectives of this multi-objective optimization problem conflict with each other: removing indicators tends to increase the recognition error. Therefore, this study adopts the Pareto optimal solution idea to balance the removal of redundant indicators with ensuring the accuracy of railway engineering safety state recognition.
Definition 1 (Dominance relationship of key indicator combinations). For two indicator combinations $x_a$ and $x_b$, if $f_i(x_a) \le f_i(x_b)$ for all $i \in \{1, 2\}$ and there is at least one $j$ such that $f_j(x_a) < f_j(x_b)$, then the key indicator combination $x_a$ is said to dominate the key indicator combination $x_b$, denoted as $x_a \prec x_b$.
Definition 2 (Non-dominated key indicator combination). If the key indicator combination $x^*$ is not dominated by any other key indicator combination, then $x^*$ is called a non-dominated key indicator combination.
Definition 3 (Pareto front for key indicator screening). The set of objective function values calculated from all non-dominated key indicator combinations in the solution space is called the Pareto front of key indicator screening (represented as $PF$), which is expressed as follows:

$$PF = \left\{ F(x^*) \mid x^* \text{ is a non-dominated key indicator combination} \right\}. \quad (3)$$

2.2. Multi-Objective Particle Swarm Optimization Algorithm
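The algorithm in this section repeatedly compares candidate indicator combinations through the dominance relation of Definitions 1 to 3. The following is a minimal Python sketch of that comparison, assuming both objectives are minimized (indicator count and recognition error). The function names and the candidate values are hypothetical and serve only as an illustration; they are not taken from the study's data.

```python
from typing import List, Sequence, Tuple

# (number of indicators, recognition error); both objectives are minimized.
Objectives = Tuple[float, float]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if a is no worse than b on every objective and strictly better on one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: Sequence[Objectives]) -> List[Objectives]:
    """Keep the points that no other point dominates (the Pareto front)."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]


# Hypothetical (indicator count, error) pairs, for illustration only.
candidates = [(20, 0.10), (25, 0.10), (15, 0.18), (30, 0.08), (15, 0.25)]
front = sorted(non_dominated(candidates))
print(front)
```

In this toy set, (25, 0.10) is dominated by (20, 0.10) and (15, 0.25) by (15, 0.18), so three trade-off points remain: fewer indicators can only be bought with a higher error.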
This study uses the multi-objective particle swarm optimization algorithm to optimize indicator selection, and the execution process of the algorithm is as follows [23,24].
Step 1: Establish the initial population. Two solution sets are maintained: one stores the individual historical solutions within the population, and the other stores the solutions of the entire population. The iteration time step and the total number of particles are set. Each particle in the population is binary-encoded, and each binary code corresponds to one solution, that is, one indicator combination;
Step 2: Use the random forest algorithm to train the dataset composed of indicator combinations. The random forest algorithm is selected because it performs well in handling imbalanced classification datasets and has been widely used for state recognition;
Step 3: Use the Euclidean distance to calculate the crowding distance in the entire population solution set, and use the crowding distance to determine the uniformity of the distribution of feasible solutions on the Pareto front. The following operations are performed:
- (1) Initialize the crowding distance of all individuals in the set, that is, $d_i = 0$;
- (2) For each objective function $f_m$, sort the individuals in the set in ascending order of $f_m$;
- (3) For the individuals at the boundary of the sorted set, set the crowding distance to infinity, that is, $d_1 = d_n = \infty$;
- (4) For the individuals that are not at the boundary, from $i = 2$ to $i = n - 1$, let

$$d_i = \sum_{m} \left( f_m^{\,i+1} - f_m^{\,i-1} \right), \quad (4)$$

where $f_m^{\,i}$ represents the value of the $m$th objective function of the $i$th individual in the sorted set, and $n$ is the number of individuals in the set.
Equation (4) indicates that, after sorting according to the objective function value, the crowding distance of an individual in the set is the sum of the differences between the objective function values of its previous and subsequent individuals.
Step 4: Select a risk-critical causal combination. The solution in the individual historical solution set represents each particle's historical optimal combination of key risk causes. The global optimal combination of key risk causes is selected from the population solution set on the basis of the crowding distance, using the roulette wheel method, a selection operator widely used in optimization algorithms. The crowding distances in the population solution set are summed, and, according to the roulette wheel method, the probability of selecting the particle corresponding to a given risk causation combination is the ratio of its crowding distance to this sum. The individual historical optimum and the individual selected from the population solution set are both used in the next step;
Step 5: Cross-operation. Convert three binary codes to real codes: the code of the individual taken from the individual historical solution set, the code of the individual selected from the population solution set, and the current code of the particle. The crossing probability controls how often the operation is applied. The cross-operation refers to the update of individuals in a population from their parents to their children, and the update process is expressed as follows:
Step 6: Mutation operation. Let $p_m$ denote the mutation probability, and select particles in the particle swarm with probability $p_m$. For each selected particle, randomly select one bit of the binary code of its risk-causing combination and flip it: if the bit is zero, it becomes one, and vice versa;
Step 7: The newly generated offspring and the current individuals jointly form a new population, so each iteration retains the best particles of both the previous and the current generation. When the risk causal combination screening process meets the termination condition (i.e., the Pareto optimal solution no longer changes), the algorithm ends; otherwise, the algorithm returns to Step 2.
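Steps 3, 4, and 6 above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: the crowding distance is the unnormalized sum of neighbour gaps described for Equation (4); the text does not say how the infinite boundary distances enter the roulette wheel, so they are capped here at the largest finite distance (an assumption); all function names are hypothetical.

```python
import math
import random
from typing import List, Sequence


def crowding_distances(objs: Sequence[Sequence[float]]) -> List[float]:
    """Step 3: per objective, sum the gap between each point's sorted neighbours."""
    n = len(objs)
    dist = [0.0] * n
    for m in range(len(objs[0])):
        order = sorted(range(n), key=lambda i: objs[i][m])
        dist[order[0]] = dist[order[-1]] = math.inf  # boundary individuals
        for pos in range(1, n - 1):
            i = order[pos]
            if math.isinf(dist[i]):
                continue  # already a boundary point for another objective
            dist[i] += objs[order[pos + 1]][m] - objs[order[pos - 1]][m]
    return dist


def roulette_select(dist: Sequence[float], rng: random.Random) -> int:
    """Step 4: pick an index with probability proportional to crowding distance.

    Assumption: infinite distances are capped at the largest finite one.
    """
    finite = [d for d in dist if math.isfinite(d)]
    cap = max(finite) if finite else 1.0
    weights = [d if math.isfinite(d) else cap for d in dist]
    total = sum(weights)
    if total <= 0:
        return rng.randrange(len(dist))
    pick = rng.uniform(0.0, total)
    running = 0.0
    for i, w in enumerate(weights):
        running += w
        if pick <= running:
            return i
    return len(dist) - 1


def flip_one_bit(code: List[int], rng: random.Random) -> List[int]:
    """Step 6: flip one randomly chosen bit of a binary indicator code."""
    child = code[:]
    k = rng.randrange(len(child))
    child[k] = 1 - child[k]
    return child


# Hypothetical (indicator count, error) values on a Pareto front.
objs = [(15, 0.18), (20, 0.10), (25, 0.09), (30, 0.08)]
d = crowding_distances(objs)
rng = random.Random(0)
leader = roulette_select(d, rng)
mutated = flip_one_bit([0, 1, 0, 1], rng)
print([round(v, 2) for v in d], leader, mutated)
```

Because the two objectives have very different scales (an indicator count versus an error rate), the unnormalized sum is dominated by the indicator count. NSGA-II style implementations usually divide each gap by the range of that objective; whether the study does so is not stated in the text.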
2.3. Bayesian Network
The Bayesian network is a directed acyclic graph with probability annotations and a knowledge visualization method based on probability analysis and graph theory. The Bayesian formula, also known as the inverse probability formula or posterior probability formula, can be regarded as reasoning from an observed effect back to its causes. Assume that $B_1, B_2, \ldots, B_n$ is a complete event group in which each event has a positive probability, $P(B_i)$ is the prior probability, and $A$ is the new additional information; then, the posterior probability is expressed by

$$P(B_i \mid A) = \frac{P(B_i)\, P(A \mid B_i)}{\sum_{j=1}^{n} P(B_j)\, P(A \mid B_j)}.$$
In the Bayesian network, nodes represent random variables abstracted from practical problems, and a directed arc between nodes reflects the correlation between them. The emitting node of a directed arc denotes a parent node, the pointed node represents a child node, the node without a parent node is a root node, and the node without a child node is a leaf node. The probability parameter value of the root node is a prior probability, which expresses the objective occurrence of the variable corresponding to the root node. The probability parameters of the other nodes in the network are taken as conditional probability tables, which express the correlation and influence with their parent nodes’ variables. The Bayesian network can be utilized to conduct a qualitative analysis using its network structure (i.e., the causal relationships between events), as well as quantitative calculations using the prior or conditional probability parameters.
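As a small worked example of the Bayesian formula at the start of this section, the sketch below updates the probability of a risk cause after an accident is observed. The priors and likelihoods are hypothetical numbers chosen for illustration, and the function name is an assumption.

```python
from typing import List, Sequence


def posterior(priors: Sequence[float], likelihoods: Sequence[float]) -> List[float]:
    """Posterior P(B_i | A) for a complete event group B_1..B_n.

    priors[i] is P(B_i); likelihoods[i] is P(A | B_i).
    """
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(A), by the law of total probability
    return [j / evidence for j in joint]


# Hypothetical values: B_1 = "cause present", B_2 = "cause absent".
priors = [0.3, 0.7]
likelihoods = [0.8, 0.1]  # P(accident | cause present), P(accident | cause absent)
post = posterior(priors, likelihoods)
print([round(p, 3) for p in post])
```

With these numbers, observing the accident raises the probability that the cause was present from 0.3 to about 0.774, which is the kind of diagnostic (effect-to-cause) reasoning the posterior analysis relies on.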
Bayesian network structure learning processes data with an algorithm to obtain the Bayesian network structure that best fits the given data. In other words, if there is a potential causal relationship between nodes $X_i$ and $X_j$ in the data, there should be a directed edge between $X_i$ and $X_j$ in the Bayesian network structure. Among the numerous score-based algorithms for Bayesian network learning, the K2 algorithm proposed by Cooper G.F. in Ref. [25] is a greedy search algorithm that uses a scoring function to measure the degree of matching between a dataset and the network structure. The sample size of data used in this study is not particularly large, and according to [26], the K2 algorithm can effectively learn Bayesian network structures for small datasets in the case of ordered variables. Therefore, this study employs the K2 algorithm to learn the Bayesian network structure of railway engineering safety risks.
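To make the greedy, score-based search concrete, the following Python sketch implements a K2-style search for binary variables using the Cooper-Herskovits score in log form. It is a simplified illustration and not the software routine used in the study: the node ordering, the toy data, and all function names are assumptions.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence


def k2_log_score(data: Sequence[Sequence[int]], child: int,
                 parents: Sequence[int]) -> float:
    """Log K2 score of a binary child given a parent set.

    For each observed parent configuration j with count N_ij:
    log[(r-1)! / (N_ij + r - 1)!] + sum_k log(N_ijk!), with r = 2 states.
    """
    r = 2
    totals = Counter(tuple(row[p] for p in parents) for row in data)
    counts = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
    score = 0.0
    for config, n_ij in totals.items():
        score += math.lgamma(r) - math.lgamma(n_ij + r)
        score += sum(math.lgamma(counts[(config, k)] + 1) for k in range(r))
    return score


def k2_search(data: Sequence[Sequence[int]], order: Sequence[int],
              max_parents: int) -> Dict[int, List[int]]:
    """Greedy K2: each node may only take parents that precede it in `order`."""
    parents: Dict[int, List[int]] = {node: [] for node in order}
    for idx, node in enumerate(order):
        best = k2_log_score(data, node, parents[node])
        candidates = list(order[:idx])
        while candidates and len(parents[node]) < max_parents:
            scored = [(k2_log_score(data, node, parents[node] + [c]), c)
                      for c in candidates]
            top_score, top = max(scored)
            if top_score <= best:
                break  # no candidate improves the score
            best = top_score
            parents[node].append(top)
            candidates.remove(top)
    return parents


# Toy 0-1 data: column 1 copies column 0; column 2 is unrelated to both.
toy = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1),
       (0, 0, 0), (1, 1, 1), (0, 0, 1), (1, 1, 0)]
structure = k2_search(toy, order=[0, 1, 2], max_parents=2)
print(structure)
```

On the toy data the search links node 0 to node 1 and leaves node 2 without parents, because adding a parent to an unrelated node lowers its score. The result depends on the variable ordering supplied, which matches the remark above that K2 works well in the case of ordered variables.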
Similarly, Bayesian network parameter learning refers to obtaining the parameters of the network nodes, namely the conditional probability tables, for a given Bayesian network structure. The conditional probability table, as a parameter of Bayesian networks, quantifies the strength of the nodes’ mutual influence. The research object of this study is the railway engineering safety risk, and the data come from railway engineering accident cases from 2005 to 2013. By analyzing each accident case and extracting the causes of the risk, a “0–1” matrix is organized as the parameter learning data. Since the accident case data can have missing records and the sample size of the collected data is small, the EM algorithm is selected in this study as the parameter learning algorithm. The EM algorithm can effectively compensate for the impact of missing accident records.
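The "0–1" matrix mentioned above can be pictured with a short Python sketch: each row is an accident case and each column a risk cause, with 1 meaning the cause was identified in that case. The three cases below are hypothetical and only show the encoding; the cause codes follow the B-numbering style used in Section 3.

```python
from typing import List, Set

# Hypothetical accident cases, each listed with the cause codes extracted from it.
cases: List[Set[str]] = [
    {"B1", "B63"},
    {"B2", "B26", "B63"},
    {"B9"},
]

# One column per cause that appears in any case, ordered by its numeric code.
columns = sorted({code for case in cases for code in case},
                 key=lambda code: int(code[1:]))

# Row i, column j is 1 if case i involves cause j, otherwise 0.
matrix = [[1 if code in case else 0 for code in columns] for case in cases]

print(columns)
for row in matrix:
    print(row)
```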
3. Railway Engineering Safety Risk Cause Analysis
This article analyzes 233 accident cases that occurred in China’s railway engineering departments between 2005 and 2013 and were collected from various channels, and it explores the causes of risks from four aspects: risk cause types, responsible parties, specific manifestations of risk causes, and accident levels.
In the existing literature, the human–machine–environmental–management safety management system has been widely used in the safety management of various enterprises, and some railway safety fields have seen risk research using this approach. Accordingly, in terms of the types of risk causes, this article classifies and explores those from four aspects: personnel risk causes, facility and equipment risk causes, environmental risk causes, and management risk causes.
3.1. Personnel Risk Cause Analysis
In the railway engineering department, various personnel and scenarios are highly complex. By summarizing and analyzing the 233 collected real cases and accurately distinguishing the objects responsible for accident risk causes, those can be roughly divided into two types: internal railway personnel responsibility (A1) and external railway personnel responsibility (A2).
In terms of internal railway personnel responsibility, the specific causes of risks are mainly related to five aspects: internal railway personnel violate regulations during construction, loading and unloading, and maintenance processes (B1); internal railway personnel have not followed standard specifications for construction, loading and unloading, and maintenance (B2); lack of in-depth learning and understanding of railway-related rules and regulations and inadequate use of mechanical facilities and equipment by internal railway personnel (B3); inadequate supervision and management by responsible internal railway persons or regulatory units (B4); and the responsible internal person or regulatory unit of the railway failed to supervise and manage the system in accordance with standard specifications (B5).
Finally, in terms of external railway personnel’s responsibility, the specific causes of risks are mainly related to six aspects: irregular operations by external railway personnel during construction, loading and unloading, and maintenance processes (B6); external railway personnel have not followed standard specifications for construction, loading and unloading, and maintenance (B7); the learning and understanding of railway-related rules and regulations, mechanical facilities, and equipment usage methods by external railway personnel are not sufficient (B8); inadequate supervision and management by responsible external persons or supervisory units of a railway (B9); responsible external persons or supervisory units of railways fail to supervise and manage the system in accordance with standard specifications (B10); and non-staff outside the railway affect the normal operation of a railway (B11).
3.2. Facility and Equipment Risk Cause Analysis
Considering that the railway engineering department is specifically responsible for the maintenance and upkeep of railway lines and related equipment, including large and medium maintenance and upkeep of bridges, tunnels, culverts, roadbeds, rails, turnouts, sleepers, and tracks, the risk responsibility objects and specific manifestations of facilities and equipment are relatively complex. By analyzing the collected data on 233 real cases, the responsible risk-causing objects may be divided into six types: responsibility for rail line facilities and equipment (A3); responsibility for locomotive and rolling stock facilities and equipment (A4); responsibility for construction and other mechanical facilities and equipment (A5); responsibility for bridge and tunnel facilities and equipment (A6); responsibility for safety protection facilities and equipment (A7); and, finally, responsibility for railway external facilities and equipment (A8).
In terms of responsibility for rail line facilities and equipment, since the track line facilities and equipment mainly include gauge rods, roadbeds, steel rails, spikes, sleepers, turnouts, and signal equipment, for the convenience of summarization and subsequent research, this study selects the status of the main facilities and equipment as a specific manifestation of their risk causes, which mainly includes ten aspects: the construction quality of track line facilities and equipment is poor, and the basic condition is poor (B12); the gauge rod breaks off or falls off, the number is insufficient, the geometric size exceeds the limit, the height and the curve versine exceed the standard, and the curve track is not smooth and round (B13); damage to the roadbed slope, insufficient quantity of ballast, insufficient width of ballast shoulder, and a lack of crushed and loose soil in the roadbed (B14); the strength of the steel rail decreases, cracks appear, fractures occur, and the joint height is misaligned (B15); spikes are floating, skewed, and out of joint, insufficient pressure and torque of fasteners, loosening and failure, weak nail holding force, and missing fasteners (B16); sleeper decay and failure, empty suspension with exposed head, loose upright bolts, and damaged shoulder (B17); loose connecting parts of the turnout, insufficient strength of the structural framework, and failure to pry along properly, which results in geometric dimensions exceeding the limit (B18); rail line facilities and equipment exceeded their service period and were not replaced in a timely manner (B19); owing to faults and other causes, the track line facilities and equipment violate the limit, affecting the normal operation of trains (B20); and failures of signal facilities and equipment on railway lines, which affect normal train operation (B21).
In terms of responsibility for locomotive and rolling stock facilities and equipment, the specific causes of risks are mainly related to two aspects: the operational status of locomotive and vehicle facilities and equipment is unsatisfactory, and safety hazards have not been detected (B22); and violation of limits by locomotive and vehicle facilities and equipment, which results in damage to track line facilities and equipment (B23).
In addition, in terms of responsibility for construction and other mechanical facilities and equipment, the specific causes of risks are mainly threefold: mechanical facilities and equipment for construction are aging and in unsatisfactory operational conditions (B24); construction and other mechanical facilities and equipment exceeded their service period and were not sent for inspection on time (B25); and construction and other mechanical facilities and equipment encroach on limits, which affects the normal operation of trains (B26).
Further, the responsibility for bridge and tunnel facilities and equipment is related to two types of specific causes of risk: low construction quality of bridge and tunnel facilities and equipment and poor foundation condition (B27); and bridge and tunnel facilities and equipment are aging and are in poor operational conditions (B28).
Beyond this, in terms of responsibility for safety protection facilities and equipment, the specific causes of risk include five aspects: failures in installing the line safety protection facilities and measures according to the specified regulatory requirements (B29); failures in setting up personnel safety protection facilities and measures according to the regulations (B30); safety protection equipment invading the line, which affects the normal operation of trains (B31); aging of safety protection facilities and equipment and poor operational condition (B32); and security monitoring facilities and equipment age and are in a poor operating condition (B33).
Finally, the responsibility for railway external facilities and equipment relates to the following types of risk causes: railway external facilities and equipment invade a line and affect the normal operation of trains (B34); and there is a low quality of construction of external facilities and equipment and a poor basic condition (B35).
3.3. Environmental Risk Cause Analysis
Considering common meteorological factors and the natural disaster types and natural environment involved in railway engineering, and summarizing the data on 233 real cases, the objects responsible for risk causes can be roughly divided into three types: natural weather responsibility (A9); natural disaster responsibility (A10); and natural environmental responsibility (A11).
In terms of the natural weather responsibility, the specific risk causes include thunder, heavy rain, and rainstorm (B36); gale, tornado, and typhoon (B37); heavy fog and haze (B38); heavy snow and blizzard (B39); and high temperature and freezing (B40).
Next, the specific causes of risk related to the natural disaster responsibility include earthquake (B41); floods, mudslides, and landslides (B42); and collapse and falling rocks (B43).
Finally, the main specific risk cause of the natural environmental responsibility is a poor natural environment foundation near the rail line (B44).
3.4. Management Risk Cause Analysis
According to the settings of the railway engineering department in personnel management, facility and equipment management, environmental management, institutional and technical management, and safety risk management, and considering the collected data on 233 real cases, the responsibility for risk causes is divided into five types: personnel management responsibility (A12), facility and equipment management responsibility (A13), environmental management responsibility (A14), institutional and technical management responsibility (A15), and safety risk management responsibility (A16).
First, in terms of the personnel management responsibility, the specific causes of risk are mainly manifested in three aspects: inadequate management, education, and training of internal railway staff (B45); inadequate management, education, and training of external railway employees (B46); and inadequate safety promotion for external railway personnel (B47).
Second, the facility and equipment management responsibility has nine types of specific risk causes: inadequate daily troubleshooting and inspection of track lines and their ancillary facilities and equipment (B48); incomplete or lagging maintenance, upkeep, and acceptance of rail line facilities and equipment, which affects the normal operation of a line (B49); inadequate daily troubleshooting and inspection of locomotive and vehicle facilities and equipment (B50); incomplete or delayed maintenance of locomotive and vehicle facilities and equipment, which affects the normal operation of locomotives and vehicles (B51); inadequate inspection, repair, and maintenance of construction machinery and equipment (B52); failure in supervision and management according to the safety supervision and management system standards for construction machinery, facilities, and equipment (B53); incomplete or delayed maintenance of facilities and equipment, such as bridges and tunnels, which affects the normal operation of a line (B54); inadequate daily troubleshooting and inspection of safety protection facilities and equipment for railway lines (B55); and, finally, inadequate inspection of railways’ external facilities and equipment (B56).
Third, in terms of the environmental management responsibility, the specific causes of risks are mostly manifested in two aspects: insufficient investigation and inspection of the natural environment near a rail line (B57); and problems detected in the natural environment near a rail line are not promptly addressed and effective measures are not taken (B58).
Fourth, for the institutional and technical management responsibility, the specific causes of risks relate to three aspects: relevant departments or personnel have not conscientiously implemented relevant management regulations or systems (B59); inadequate professional technical management in the construction and usage of facilities and equipment, fault handling, and accident rescue (B60); and mismatch between the construction organization plan, construction effect, and actual demand (B61).
Finally, in terms of the safety risk management responsibility, the specific causes of risks include five aspects: the safety supervision and management system is not sound (B62); failures in performing the supervision and management according to the safety supervision and management system standards (B63); the safety risk prevention and management system is not sound (B64); insufficient attention to potential security risks (B65); and a failure in drawing inferences about similar safety risk issues (B66).
3.5. Accident Level Indicators’ Extraction
The accident level index is divided into four levels according to the suggestions given in Chapter 2 of the “Rules for Investigation and Handling of Railway Traffic Accidents in China” (Order No. 30 of the Ministry of Railways in 2007): particularly serious accidents (T1), serious accidents (T2), major accidents (T3), and general accidents. The general accidents are further divided into general class A accidents (T4), general class B accidents (T5), general class C accidents (T6), and general class D accidents (T7).
6. Risk Analysis of Railway Engineering Safety Based on Bayesian Network
6.1. Bayesian Network Structure Learning
This study used the K2 algorithm to learn the Bayesian network structure since it is effective for small datasets. The GeNIe 4.0 Academic software was used to import the railway engineering safety risk-critical cause data matrix, and the Greedy Thick Thinning algorithm that comes with the software was employed to learn the Bayesian network structure from that matrix. Because the size of a node’s conditional probability table grows exponentially with the number of its parent nodes, the maximum number of parent nodes was set to eight. Limiting the maximum number of parent nodes also keeps the structure search bounded. Finally, one invalid node was removed, and the initial Bayesian network structure for risk-critical factors of railway engineering safety was output, as shown in Figure 2.
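The effect of the parent limit on the conditional probability table can be checked with simple arithmetic. The sketch below assumes binary nodes, as in the 0/1 coding of the risk causes, and compares the number of parent configurations with the 233 accident cases analyzed in this article; the function name is hypothetical.

```python
def parent_configurations(num_parents: int, states_per_parent: int = 2) -> int:
    """Number of parent-state combinations a node's CPT must cover."""
    return states_per_parent ** num_parents


NUM_CASES = 233  # accident cases analyzed in this article

for k in (1, 2, 4, 8):
    configs = parent_configurations(k)
    print(f"{k} parents -> {configs} configurations, "
          f"{NUM_CASES / configs:.2f} cases per configuration on average")
```

With eight binary parents a node already has 256 parent configurations, more than the number of available cases, so nodes that actually reach the limit would have many configurations with little or no supporting data.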
6.2. Bayesian Network Parameter Learning
The data samples used in this study were sourced from real accident reports. For various reasons, the information in these reports was not always uniform, and some information might have been missing. Therefore, the EM algorithm was employed to compensate for the small number of missing entries in the fault reports, improve the convergence speed for the small sample size, and estimate the probability values between nodes accurately. Using the parameter learning algorithm provided by the GeNIe 4.0 Academic software, each network node was matched with the corresponding column in the fault data matrix; the values of zero and one in the data matrix corresponded to States 0 and 1 of a node, respectively. State 0 indicated that the risk cause did not exist, and State 1 indicated that the risk cause existed. The EM algorithm was run with randomized initial parameters as the starting point of the E-step, and the conditional probability tables of the Bayesian network were obtained. The EM algorithm was selected because it is well suited to data that might contain hidden variables. For this type of data, randomized initial parameters can reduce the risk of converging to a poor local maximum compared with uniformly distributed initial parameters. The final generated probability map of risks’ key causal conditions in railway engineering safety is shown in Figure 3.
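The idea behind EM parameter learning with missing entries can be sketched on a two-node network A -> B. This is a minimal illustration under stated assumptions, not the GeNIe routine: the function name `em_two_node`, the toy rows, and the choice that only the parent A may be missing (encoded as `None`) are hypothetical.

```python
import random

def em_two_node(rows, n_iter=50, seed=0):
    """EM for a binary network A -> B where A may be missing (None) in some rows.

    Returns (P(A=1), [P(B=1 | A=0), P(B=1 | A=1)]).
    """
    rng = random.Random(seed)
    # Randomized starting parameters, kept away from 0 and 1.
    p_a = rng.uniform(0.2, 0.8)
    p_b = [rng.uniform(0.2, 0.8), rng.uniform(0.2, 0.8)]
    for _ in range(n_iter):
        n_a = [0.0, 0.0]    # expected number of rows with A = s
        n_ab1 = [0.0, 0.0]  # expected number of rows with A = s and B = 1
        for a, b in rows:
            if a is None:
                # E-step: posterior weight of each state of A given the observed B.
                like = [p_b[s] if b == 1 else 1.0 - p_b[s] for s in (0, 1)]
                w0 = (1.0 - p_a) * like[0]
                w1 = p_a * like[1]
                w = [w0 / (w0 + w1), w1 / (w0 + w1)]
            else:
                w = [1.0 - a, float(a)]
            for s in (0, 1):
                n_a[s] += w[s]
                n_ab1[s] += w[s] * b
        # M-step: re-estimate the parameters from the expected counts.
        p_a = n_a[1] / len(rows)
        p_b = [n_ab1[s] / n_a[s] for s in (0, 1)]
    return p_a, p_b

complete = [(1, 1)] * 3 + [(1, 0)] + [(0, 1)] + [(0, 0)] * 3
with_gaps = complete + [(None, 1), (None, 0)]
p_a, p_b = em_two_node(with_gaps)
print(p_a, p_b)
```

With complete data the procedure reduces to ordinary frequency counting; the rows with a missing parent contribute fractional counts instead of being discarded.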
6.3. Occurrence Probability and Ranking of Risks’ Key Causes
Based on the Bayesian network structure for the key causes of railway engineering safety risks established in Section 6.1, the marginal probabilities of 20 risk-critical factors were calculated. The marginal probabilities were extracted and sorted from high to low according to the probability of occurrence of each risk-critical factor, as shown in Table 4.
According to the sorted results, failure to perform supervision and management in accordance with the safety supervision and management system standards had the highest probability among all the risk causes (27.90%). Thus, the railway engineering department should enhance safety prevention awareness, raise the level of safety supervision and management, and avoid safety accidents caused by non-standard management.
In addition, four other causes deserve attention. First, internal railway personnel did not follow standard specifications for construction, loading and unloading, and maintenance (25.32%). Second, internal railway personnel violated the regulations during the construction, loading and unloading, and maintenance processes (21.03%). Third, the construction and other mechanical facilities and equipment encroached on limits, which affected the normal operation of trains (14.80%). Fourth, the supervision and management operations performed by the responsible external personnel and supervisory units of the railway were inadequate (14.59%). Therefore, more attention should be paid to training the internal and external personnel of the railway to enhance their awareness of safety precautions, strengthen their professional skills, and show them how to perform standardized operations according to the relevant regulations, thus preventing operating violations and the poor condition of facilities and equipment.
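The two steps in this section, marginalizing a node over its parent and ranking the resulting probabilities, can be sketched as follows. The three probabilities in `reported` are the values quoted in the text for B63, B2, and B26; the helper names and the numbers passed to `marginal` are hypothetical.

```python
def marginal(p_parent1, p_child1_given):
    """P(child=1) = sum over a of P(parent=a) * P(child=1 | parent=a), binary nodes."""
    p_given_0, p_given_1 = p_child1_given
    return (1.0 - p_parent1) * p_given_0 + p_parent1 * p_given_1

def rank(marginals):
    """Sort risk causes from the highest to the lowest marginal probability."""
    return sorted(marginals.items(), key=lambda kv: kv[1], reverse=True)

# Marginal probabilities quoted in the text for three of the 20 risk causes.
reported = {"B2": 0.2532, "B26": 0.1480, "B63": 0.2790}
ranking = rank(reported)
print(ranking)

# Hypothetical single-parent example of the marginalization itself.
print(marginal(0.5, (0.2, 0.6)))
```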
6.4. Posterior Probability Analysis of Risks’ Key Causes
By using the Bayesian network diagnostic reasoning function of the GeNIe 4.0 Academic software, the posterior probability of the parent node was calculated for various risk-critical causes. The calculation results are shown in Table 5.
The results of this reverse reasoning give, for each specific manifestation of a risk cause, the probability that its parent-node risk cause had occurred. Based on this information, the most likely cause of each specific manifestation was identified. This will enable the railway engineering department to learn from past lessons, draw analogies, and achieve precise management.
When the gauge rod was broken, falling off, or insufficient in quantity, the geometric size was beyond the limit, the height and curve vector were also beyond the limit, or the curve track was not smooth (B13), the probability that the railway internal personnel had not followed the standard specifications for construction, loading and unloading vehicles, and maintenance (B2) was high, reaching 73.41%. These results indicate that the status of facilities and equipment was closely related to whether internal personnel operated in a standardized manner. Therefore, it is necessary to improve the training process of internal railway personnel, focus on inspecting various facilities and equipment in accordance with the standard specifications, and prevent the occurrence of safety accidents caused by non-standard operations, which can lead to damage to various facilities and equipment.
Further, when construction and other mechanical facilities and equipment violated the limit, which affected the normal operation of trains (B26), the probability of failure to perform the supervision and management process in accordance with the safety supervision and management system standards (B63) was 56.05%. Thus, the status of facilities and equipment was also affected by management. Therefore, the railway engineering safety supervision and management system standards should be strictly implemented, and possible problems in facilities and equipment should be carefully investigated to prevent faults that might affect the normal operation of trains.
Finally, when some specific manifestations of risk causes occurred, there was a certain probability that the rail line facilities and equipment had exceeded their service period without being replaced in a timely manner (B19), or that earthquakes (B41) or floods, mudslides, and landslides (B42) had occurred. Therefore, railway engineering departments should pay special attention to the risk prevention process under the conditions of natural disasters and adverse weather.
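The diagnostic reasoning used in this section is an application of Bayes' rule: given that a child node is observed in State 1, the probability of its parent being in State 1 is updated. A minimal sketch for one binary parent follows; the function name and all numbers are hypothetical and are not taken from Table 5.

```python
def posterior_parent(prior, lik_given_1, lik_given_0):
    """P(parent=1 | child=1) for a binary parent.

    prior       : P(parent=1)
    lik_given_1 : P(child=1 | parent=1)
    lik_given_0 : P(child=1 | parent=0)
    """
    evidence = prior * lik_given_1 + (1.0 - prior) * lik_given_0  # P(child=1)
    return prior * lik_given_1 / evidence

# Hypothetical values: the parent is present in 25% of cases, and the child
# manifestation is six times more likely when the parent is present.
print(posterior_parent(prior=0.25, lik_given_1=0.30, lik_given_0=0.05))
```

In a full network the software performs the same update over all nodes jointly; the single-parent case only shows why a posterior can be far higher than the corresponding marginal probability.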
6.5. Transmission Mechanism and Probability Analysis of Risks’ Key Causes
According to the Bayesian network structure of the railway engineering safety risk-critical causes designed in Section 6.1, this study defined B2 and B42 as the initial risk-critical causes with State 1, indicating the occurrence of risk-critical causes. Their sub-nodes were studied separately, the transmission path of the key risk causes was determined, the probability of occurrence of the final key risk cause was calculated, and the impact of the initial key risk cause on the final key risk cause was evaluated and quantified. The forward transmission paths and their probabilities of occurrence are shown in Table 6.
When natural disasters, such as floods, mudslides, and landslides (B42), occurred, the probability of the security monitoring facilities and equipment being aged and in a poor operating condition (B33) was the highest, having a value of 33.20%.
In addition, the following factors also had a significant impact: the violation of limits by locomotive and vehicle facilities and equipment, which could damage track line facilities and equipment (B23); construction and other mechanical facilities and equipment that exceeded their service period or were not sent for inspection on time (B25); inadequate safety publicity for external railway personnel (B47); and failure to take timely and effective measures for problems detected in the natural environment near the track line (B58).
Finally, the failure of internal railway personnel to follow standard specifications for construction, loading and unloading, and maintenance (B2), floods, mudslides, and landslides (B42), and the failure to supervise and manage according to the safety supervision and management system standards (B63) each caused multiple facility and equipment risk factors and management risk factors. In particular, natural disasters, such as floods, mudslides, and landslides, had a strong impact on facilities, equipment, the environment, and management. In recent years, several major railway safety accidents in China have occurred under natural disaster conditions, such as floods, mudslides, and landslides. Therefore, railway engineering departments should put additional efforts into preventing various safety risks and managing facilities and equipment during the rainy season and flood season.
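The forward transmission analysis in this section fixes an initial cause in State 1 and propagates that evidence along a path to a final cause. A minimal sketch for a chain A -> B -> C of binary nodes follows; the function name and the conditional probabilities are hypothetical and are not the values in Table 6.

```python
def forward(a_state, p_b1_given_a, p_c1_given_b):
    """P(C=1 | A=a_state) in a chain A -> B -> C with binary nodes.

    p_b1_given_a : (P(B=1 | A=0), P(B=1 | A=1))
    p_c1_given_b : (P(C=1 | B=0), P(C=1 | B=1))
    """
    p_b1 = p_b1_given_a[a_state]
    # Sum over the two states of the intermediate node B.
    return (1.0 - p_b1) * p_c1_given_b[0] + p_b1 * p_c1_given_b[1]

p_b1_given_a = (0.1, 0.6)    # hypothetical
p_c1_given_b = (0.05, 0.5)   # hypothetical
with_cause = forward(1, p_b1_given_a, p_c1_given_b)
without_cause = forward(0, p_b1_given_a, p_c1_given_b)
print(with_cause, without_cause)
```

Comparing the two values gives one simple way to quantify how strongly the initial cause raises the probability of the final cause along that path.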
7. Discussion
Based on a review of existing studies in the field of railway safety, most scholars appear to have adopted a single method for analyzing safety risks under a single scenario or indicator, without considering the correlated risk factors present in the railway system, optimization under multiple objectives, or the coordination of demands across multiple dimensions. Moreover, few studies have addressed the optimization of the index combination itself, and some methods used in subjective evaluations cannot determine the logical relationship between indicators and outputs.
Liu Yang [12] proposed and improved relevant text feature extraction methods, established an expert database, and used text correlation to determine key accident cause factors. The study also identified the accident chain in the accident-cause-related network, uncovered the internal relationships between the factors, and found the key accident chain and cause factors. Zong Zhou [13] applied and improved the Domino model in railway operation safety risk identification, calculated the similarity of a railway operation risk network based on Vague set theory, explored and visualized association rules for the causes of railway operation risks in the UK, and determined the scope of railway operation risks in the UK from 2010 to 2019. By improving the FRAM model, Yifei Xu [14] proposed an accident cause analysis method for the railway transport system based on the ISM-IFRAM model, analyzed the causes of China’s “7.23” Yongwen Line accident (an especially major railway traffic accident), and offered improvement suggestions concerning resource allocation for the railway transport system based on the analysis results. Shanshan Wang [15] quantitatively analyzed nodes and community risks in the network, built a railway accident-cause-related network model, identified and analyzed the accident cause chain in the network, and analyzed the railway accident cause network under dynamic weights. Xinsheng Shan [16] built an HFACS model suitable for safe production in power supply scheduling based on the statistical data of power supply scheduling safety information of the China Railway Guangzhou Bureau in the past three years, and designed a method for expert evaluation and fuzzy comprehensive calculation to conduct a quantitative analysis of the membership degrees of factors at each level of the model. Weijian Cui [17], utilizing part of China’s railway network security assessment data from 2019 to 2020, adopted a railway network security risk analysis method based on an improved Apriori algorithm and conducted application research on the whole process of risk analysis, including data collection, preprocessing, mining, and calculation. Yuchen Luo [18] established the HFACS model framework for addressing the personal safety accidents of Chinese railway public servants, classified and elaborated the factors that caused the personal safety risks of public servants, analyzed the factors that caused the risks, and evaluated the accident model data using the set pair analysis theory model. Yu Gao [19] applied the constructed LTE-R risk assessment system and ranking model to the ranking of LTE-R system risk factors and verified the effectiveness of the model in the field of rail transit by testing the sensitivity of the MAIRCA algorithm under different weights and comparing the ranking results of various algorithms. Based on the SPA-TOPSIS model, Xiaoxu Shen [20] proposed a multi-perspective evaluation method for railway transport safety, conducted case evaluations of railway transport safety from regional perspectives and line-specific perspectives, analyzed and calculated the values and fluctuations of evaluation indicators under different perspectives, analyzed the evaluation results, and gave corresponding decision-making suggestions. Shi Zhang [21] used the fuzzy interpretive structural model to build a fuzzy ISM model of the accident risks of the train service system, used D-S evidence theory and a Bayesian network to build an accident prediction model of the railway train service system, and, using actual train service system data, conducted accident prediction and analysis for the train service system. In the above studies, most of the analysis methods of railway safety risks used by scholars were relatively simple. Although some scholars combined multiple analysis methods and put forward their own innovations in the research, they did not analyze the correlations between safety risks, resulting in certain deficiencies.
With the development of science and technology, new technologies and new equipment are constantly being introduced into railway operation and management. Nonetheless, railway safety remains affected by various factors such as the uneven quality of workers, the limited management level, and sudden and severe natural disasters. This paper has therefore addressed how to study the existing railway accident data, uncover the internal mechanisms, underlying patterns, leading factors, and transmission mechanisms of the accidents, and apply the resulting findings and experience to actual operation management, so as to improve the overall safety level of the railway operation management system.
Firstly, the method of multi-objective index screening was used in this paper to make index screening more suitable for practical applications. Secondly, a Bayesian network was used, the structure of the Bayesian network was optimized, and the causal logic chain of the accident result was obtained. Then, based on the research results of this paper, we analyzed the impacts of personnel risk, facility and equipment risk, environmental risk, and management risk, both separately and in combination. More precisely, we analyzed the causes of accidents and the relationships between them, thereby addressing the lack of previous research on risk transmission mechanisms in railway safety.
8. Conclusions
This study has analyzed real cases of railway engineering accidents in China, using the multi-objective particle swarm optimization algorithm to determine the key risk factors. Bayesian networks have been used to analyze the transmission mechanisms and probabilities of key risk factors and obtain the causal logic chain that produces accident results. Finally, it has been shown that combining the indicators and Bayesian networks can improve the accuracy of risk prediction and provide more accurate results than existing approaches, and it can fill the gap in research on risk transmission mechanisms in railway safety.
Based on the results, the following conclusions are drawn.
From the perspective of personnel risk causes, the probabilities of internal railway personnel not following standard specifications for construction and violating regulations were as high as 25.32% and 21.03%, respectively. It is necessary to provide effective training for the various types of railway personnel, enhance their safety awareness, focus on the various operations of internal railway personnel, and avoid the occurrence of facility equipment failures and safety accidents caused by illegal and non-standard operations.
In terms of facilities and equipment risk causes, failures of facilities and equipment are closely related to the non-standard operation and supervision/management of personnel; given the track facility fault B13, the posterior probability that internal personnel had not followed the standard specifications (B2) reached 73.41%. We also found that these risk causes can easily serve as an intermediate node in the transmission path of key accident causes. Therefore, it is necessary to improve the operation and supervision/management of various railway personnel. Moreover, additional attention should also be paid to the usage status of facilities and equipment to avoid the problems of overdue services or lack of timely replacement.
In addition, regarding environmental risk causes, natural disasters, such as floods, mudslides, and landslides, are prone to causing nine chain reactions, including equipment failures in railway lines and other facilities and an inadequate safety supervision and management system, which can easily lead to safety accidents. Therefore, railway-related departments should improve relevant safety management measures according to the actual situation to reduce the probability of occurrence and transmission of related risks during severe natural disasters.
Further, in terms of management risk causes, the probability of failure to perform supervision and management according to the standards for the safety supervision and management system was found to be 27.90%, and this cause was shown to have a significant impact on the other risk causes. Thus, railway-related departments should attach great importance to the management of personnel, facilities, equipment, and the environment, improve the relevant safety supervision and management systems, and ensure that management plays its due role.
One limitation of this study is that, due to the influence of relevant policies, the accident cases used occurred relatively long ago, which reduces their reference value for the current railway engineering department. Moreover, in the process of extracting risk indicators, due to an insufficient understanding of the railway engineering department, the classification of individual risks might not have been precise enough, which could have affected the screening of key causes. To overcome this, future research could refine the risk classification and strengthen the correlation between the objects responsible for risk causation and the specific manifestations of risk causation, so as to achieve more accurate transmission diagnosis and probability analysis, stronger causal logic, and greater practicality.