1. Introduction
As societies develop and populations grow, natural and man-made emergencies such as fires, earthquakes, and stampedes are becoming more frequent. These emergencies often occur in metro stations, large shopping malls, and other densely populated areas. If an evacuation is handled poorly, it can result in heavy casualties and property damage, so how to evacuate people effectively in an emergency has become an important research topic in safety science.
Researchers began studying evacuation behavior in the early 1900s. Initially, evacuation behavior was studied mainly through surveys, experiments, and observations. Hankin and Wright investigated pedestrian flow in London underground passages, found a relationship between the speed, density, and flow of unidirectional pedestrian streams, and set out a preliminary pedestrian theory [
1]. Bryan [
2] and Wood surveyed people who had escaped from fires and obtained data on their behavioral responses. Pauls observed people's activities at events in the field and summarized the relevant patterns [
3]. Zhang Shuping [
4] obtained a conversion factor interval for the number of evacuees in the business part of a large shopping mall building by measuring and statistically analyzing the actual flow of people in the mall. Zhong Guangchun et al. conducted a questionnaire survey on safety awareness and evacuation in an earthquake disaster situation at a university in Nanjing and found that males were more familiar with evacuation sites than females, but the proportion of males waiting in line to pass through evacuation routes was smaller than that of females [
5]. Guo Summer et al. investigated the evacuation behavior of people in the underground by means of a questionnaire and found that whether people in the underground chose to evacuate or not was related to education and age, but not much to gender [
6].
From the 1980s onward, with the rapid development of computers, researchers began to build computer models of human behavior and to study it systematically [
7]. Fang et al. [
8] proposed a spatial grid-based evacuation model, SGEM, and used the model to simulate the evacuation of people. They concluded that the network grid composite model could not only simulate the evacuation of large and complex buildings but also visualize the detailed evacuation process. Song Weiguo et al. [
9] proposed a lattice gas model for evacuating people that takes the characteristics of the population distribution into account. The model not only predicts an interval for the number of evacuees but also derives the time required to evacuate a given number of people, thus yielding a quantitative relationship between the number of evacuees and the evacuation time. The model can also analyze the quantitative impact of the spatial distribution of people on the evacuation outcome and its uncertainty. Wang Yiheng et al. [
10] established an emergency evacuation index system through a BP neural network, conducted a simulation evaluation with the underground stations around Beijing’s North Third Ring Road as the main research object, and proposed an improvement scheme. Yang Zhaosheng et al. [
11] proposed an improved particle swarm optimization algorithm based on ranking selection, which can evacuate vehicles from public places efficiently and safely in emergency situations. Duan Xiaohong et al. [
12] used the bat algorithm to shorten the transit time of emergency vehicles, improving the efficiency of emergency rescue; the algorithm also showed good search capability and running speed. Based on the social force model, Cheng Yao et al. [
13] used simulation to study the pedestrian emergency evacuation problem and experimentally demonstrated four factors that affect the evacuation process. The four factors include pedestrians’ familiarity with the layout of the place, pedestrians’ grouping behavior, the internal layout of the place, and the safe carrying capacity of the place. Qi et al. [
14] simulated non-adaptive evacuation behaviors of pedestrians, such as inertial, folding, herding, and partnering behavior, and demonstrated that all of them reduce pedestrian evacuation efficiency to some degree. Cao Siqi et al. [
15] used Dijkstra’s algorithm to calculate the optimal evacuation path under a fire scenario and demonstrated that the method is consistent with the actual evacuation road situation and can effectively determine the optimal evacuation route for traditional village complexes under fire. S. Peeta et al. [
16] proposed a fuzzy logic (FL) model to cope with the heterogeneous behavior of evacuees after disasters such as earthquakes and chemical plant explosions, which introduces large uncertainties into emergency evacuation measures. Khalid A. Albis et al. [
17] proposed a model for sudden fire evacuation of large shopping malls through fire dynamics simulation. Khalili-Damghanin et al. [
18] proposed a hybrid mathematical planning model for uncertain multi-objective, multi-commodity, multi-cycle location assignment using a multi-objective optimization and location assignment model. Amany et al. [
19] developed an algorithm using the CFAST model to generate the shortest evacuation route in the form of a clear tree diagram in the shortest evacuation time.
Building on pedestrian traffic modeling and the continuous improvement of computer models, a variety of commercial pedestrian traffic simulation software has emerged, including Legion, Vissim, Simwork, and Anylogic. The improvement of models and the development of simulation software have solved the emergency evacuation problem to a certain extent. However, when extracting information about evacuation targets, most researchers acquire data by non-machine-vision methods, including broadband satellite networks, wireless local area networks (WLAN), and Bluetooth sensors.
Although non-machine-vision methods have achieved good research results, they are difficult to promote and apply widely because of the many uncertainties in the data acquisition process. Building on previous research, this paper therefore applies the YOLOv5 target detection algorithm to the emergency evacuation problem for the first time, using its stronger detection capability to compensate for the shortcomings of traditional methods in data acquisition. YOLOv5 is combined with the Anylogic simulation software (version 8.5.0): the pedestrian locations detected by YOLOv5 are input into Anylogic, which simulates the emergency evacuation and produces a suitable evacuation plan. Compared with traditional methods, the proposed method collects data faster and thereby shortens the overall emergency evacuation process. In an emergency, where time is of the essence, making a quick decision and choosing the most appropriate solution can greatly reduce harm to people and damage to property.
The remainder of the paper is structured as follows:
Section 2 outlines the structure of the YOLOv5 target detection network and the improvement strategy of this paper, together with an introduction to the Anylogic simulation software.
Section 3 outlines the experimental procedure, parameter configuration, and evaluation metrics, and analyzes the experimental results of the method in this paper.
Finally, Section 4 concludes the paper and presents future research plans.
2. Materials and Methods
2.1. Analysis and Design of the YOLOv5s-SE Algorithm
The YOLO [
20,
21,
22,
23] (You Only Look Once) family of target detection algorithms accomplishes target location and classification by using direct prediction of the target’s bounding box, which has the benefit of increasing detection speed. YOLOv5 is an improvement of the YOLO series algorithm [
24]. It builds on the YOLOv4 target detection algorithm by using auto-scaling, cropping, and mosaic for data enhancement and by adding automatic learning of the anchor box sizes. The network structure of YOLOv5 is divided into three parts. The first part is the Backbone, whose role is mainly feature extraction; the second part is the Neck, which mixes and combines features and passes them on for prediction; and the third part is the Head, which is responsible for the final prediction and the output of the prediction results. In tests on the official dataset, YOLOv5 has shown some improvement in detection speed and accuracy over YOLOv4, and a nearly 90% reduction in model size compared to YOLOv4 [
25]. The YOLOv5 algorithm comes in four variants, YOLOv5s, YOLOv5m, YOLOv5l, and YOLOv5x, ordered from small to large by network depth and feature map width; the depth of the YOLOv5s model is 1/3 that of YOLOv5l, and the width of its feature maps is 1/2 that of YOLOv5l. Because this paper aims to detect pedestrians in emergency situations, detection speed matters more than detection accuracy, so YOLOv5s, which has a faster detection speed, is chosen as the object of study.
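The depth and width ratios quoted above can be made concrete with a small sketch. The two helper functions and the example layer sizes are assumptions for illustration, loosely modeled on how the public YOLOv5 code scales a base configuration; only the 1/3 and 1/2 multipliers come from the text.

```python
import math


def scale_depth(repeats: int, depth_multiple: float) -> int:
    """Scale how many times a block is repeated; never drop below one."""
    if repeats <= 1:
        return repeats
    return max(round(repeats * depth_multiple), 1)


def scale_width(channels: int, width_multiple: float, divisor: int = 8) -> int:
    """Scale a channel count and round it up to a multiple of `divisor`."""
    return math.ceil(channels * width_multiple / divisor) * divisor


# Multipliers of YOLOv5s relative to YOLOv5l, as stated in the text.
YOLOV5S = {"depth_multiple": 1 / 3, "width_multiple": 1 / 2}

# Hypothetical YOLOv5l stage: 9 repeated blocks with 512 output channels.
print(scale_depth(9, YOLOV5S["depth_multiple"]))    # 3
print(scale_width(512, YOLOV5S["width_multiple"]))  # 256
```

Under these assumptions, a stage that repeats nine blocks with 512 channels in YOLOv5l shrinks to three blocks with 256 channels in YOLOv5s, which is where the speed advantage comes from.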
Although YOLOv5s has a fast detection speed, its detection results for groups of small targets are not satisfactory. As the YOLOv5s network deepens, the information extracted at the output becomes increasingly abstract, and distant pedestrian heads inside the station become increasingly difficult to detect. Therefore, to enhance the accuracy of YOLOv5s for small target group detection, this paper incorporates SENet [
26] into the YOLOv5s network structure, and the improved network structure is shown in
Figure 1.
On the input side, YOLOv5s adopts Mosaic data enhancement, which to a certain extent mitigates the poor training results and model overfitting caused by insufficient sample data. Second, it introduces adaptive anchor box calculation, which obtains prediction boxes from predetermined prior boxes during network training. Third, it adds an adaptive image scaling mechanism, which effectively improves the model's training speed. In the feature extraction backbone network, YOLOv5s uses the Focus structure and Cross Stage Partial [
27] (CSP) structure in two variants: CSP1_X in the feature extraction network and CSP2_X in the feature fusion network. The Neck network layer of YOLOv5s consists of Feature Pyramid Networks [
28] (FPN) and Path Aggregation Networks [
29] (PAN). FPN passes rich semantic information from top to bottom, PAN passes localization information from bottom to top, and together they fuse features from different backbone layers. The prediction side of YOLOv5s consists of three detectors that operate on feature maps at three scales (large, medium, and small); each prediction contains the target box coordinates, a confidence score, and category information.
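To illustrate the adaptive image scaling and the three detection scales described above, the following sketch computes the resized shape, the minimal padding, and the resulting grid sizes for one frame. The 640-pixel target size, the strides of 8, 16, and 32, and the function names are assumptions chosen for illustration; they are not stated in this paper.

```python
def letterbox_shape(height, width, target=640, stride=32):
    """Aspect-preserving resize plus the minimal padding to a stride multiple.

    Returns (new_height, new_width, pad_height, pad_width).
    """
    ratio = min(target / height, target / width)
    new_h, new_w = round(height * ratio), round(width * ratio)
    # Pad only up to the next multiple of `stride`, not to a full square,
    # so that fewer padded pixels have to be processed.
    pad_h = (target - new_h) % stride
    pad_w = (target - new_w) % stride
    return new_h, new_w, pad_h, pad_w


def grid_sizes(padded_h, padded_w, strides=(8, 16, 32)):
    """Feature-map sizes seen by the three detectors (assumed strides)."""
    return [(padded_h // s, padded_w // s) for s in strides]


# Hypothetical 1080 x 1920 surveillance frame.
new_h, new_w, pad_h, pad_w = letterbox_shape(1080, 1920)
print(new_h, new_w, pad_h, pad_w)                 # 360 640 24 0
print(grid_sizes(new_h + pad_h, new_w + pad_w))   # [(48, 80), (24, 40), (12, 20)]
```

The finest grid (stride 8 here) is the one that has to resolve distant pedestrian heads, which is why small targets are the hardest case.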
In YOLOv5s, the SPP [
30] (Spatial Pyramid Pooling) module further extracts features by serial pooling, which enhances the deep feature representation capability of the backbone network and enlarges the model's receptive field. In this paper, to increase the accuracy of YOLOv5s for distant small target detection, SENet is added before SPP, which allows the model to obtain more global features of the target. The structure of the SENet network is shown in
Figure 2.
SENet works in three steps. Squeeze compresses each channel of the input feature map into a single value by global average pooling. Excitation uses a fully connected neural network to perform a non-linear transformation on the result of Squeeze. Feature rescaling multiplies the input features, channel by channel, by the weights obtained from Excitation. Incorporating the SE attention mechanism into the YOLOv5s network structure helps YOLOv5s focus on important features and suppress less useful ones, improving the accuracy of pedestrian head detection. After detection is complete, the results are fed into the Anylogic simulation software to simulate the pedestrian evacuation route.
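The three SE steps can be sketched in plain Python on nested lists. This is a minimal illustration, not the network used in the paper: the weight matrices `w1` and `w2` are hypothetical, biases are omitted, and the ReLU and sigmoid activations follow the standard SENet design.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def se_block(feature_maps, w1, w2):
    """Apply squeeze, excitation, and rescaling to a list of 2D channels.

    feature_maps: C channels, each a list of rows of floats.
    w1: reduction weights, shape (C // r) x C (hypothetical values).
    w2: expansion weights, shape C x (C // r) (hypothetical values).
    Returns (rescaled_feature_maps, channel_weights).
    """
    # Squeeze: global average pooling gives one value per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_maps
    ]
    # Excitation: fully connected -> ReLU -> fully connected -> sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    weights = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Rescaling: multiply every value in a channel by that channel's weight.
    rescaled = [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(feature_maps, weights)
    ]
    return rescaled, weights


channels = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
rescaled, weights = se_block(channels, w1=[[1.0, 0.0]], w2=[[0.0], [1.0]])
print(weights)  # first channel weight is 0.5, second is sigmoid(1.0)
```

Channels whose weight is close to 1 pass through almost unchanged, while channels with a small weight are suppressed.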
In order to further improve the accuracy of the network model for small target object detection, this paper also introduces Normalized Wasserstein Distance [
31] (NWD) as a new metric. The original metric of YOLOv5 is IoU. However, IoU and its extensions are very sensitive to the position deviation of small targets, which severely degrades detection performance when they are used in anchor-based detectors. The NWD metric can be easily embedded in the assignment, non-maximum suppression, and loss functions of any anchor-based detector to replace the commonly used IoU metric. In practice, however, we found that replacing IoU completely with NWD can make the model converge too slowly. To avoid this, this paper uses NWD and IoU together and sets a scaling ratio between them; changing this ratio changes the relative weight of NWD and IoU. In this way, the accuracy of the network model for small target detection is further improved while overly slow convergence is avoided.
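A minimal sketch of how NWD and IoU might be blended is given below. The linear blend, the `iou_ratio` parameter, and the normalizing constant `12.8` are assumptions for illustration; the paper states only that a scaling ratio between the two metrics is used. Boxes are given as `(cx, cy, w, h)`.

```python
import math


def to_corners(box):
    cx, cy, w, h = box
    return cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2


def iou(box_a, box_b):
    """Intersection over union of two (cx, cy, w, h) boxes."""
    ax1, ay1, ax2, ay2 = to_corners(box_a)
    bx1, by1, bx2, by2 = to_corners(box_b)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0


def nwd(box_a, box_b, constant=12.8):
    """Normalized Wasserstein distance between boxes modeled as 2D Gaussians.

    `constant` is a dataset-dependent normalizer; 12.8 is a placeholder.
    """
    distance = math.sqrt(
        (box_a[0] - box_b[0]) ** 2
        + (box_a[1] - box_b[1]) ** 2
        + ((box_a[2] - box_b[2]) / 2) ** 2
        + ((box_a[3] - box_b[3]) / 2) ** 2
    )
    return math.exp(-distance / constant)


def blended_similarity(box_a, box_b, iou_ratio=0.5, constant=12.8):
    """Weighted mix of IoU and NWD; iou_ratio=1.0 recovers plain IoU."""
    return iou_ratio * iou(box_a, box_b) + (1 - iou_ratio) * nwd(box_a, box_b, constant)


# Two 4 x 4 boxes whose centers are 6 pixels apart do not overlap at all.
small_a, small_b = (10, 10, 4, 4), (16, 10, 4, 4)
print(iou(small_a, small_b))                 # 0.0
print(blended_similarity(small_a, small_b))  # still positive thanks to NWD
```

The example shows the motivation: for tiny boxes a shift of a few pixels drives IoU to zero, whereas NWD still decreases smoothly with distance and keeps providing a usable signal.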
2.2. Anylogic Emergency Evacuation Modelling
In many practical cases of emergency evacuation, the efficiency of pedestrian evacuation is often influenced by two factors, namely the human factor and the external environment.
Human factors include human characteristics and psychological factors. Human characteristics affect walking speed: for example, men generally walk faster than women, young people generally walk faster than older people, and pedestrians in good health generally walk faster than those in poor health. Psychological factors also affect the evacuation process. When people encounter a sudden dangerous event, their psychological state changes considerably, and panic can lead pedestrians to make irrational choices. Pedestrians who are unfamiliar with the interior of the premises may be tempted to follow the crowd, which can cause serious overcrowding and greatly reduce evacuation efficiency. When people help each other during an evacuation, the effect on evacuation efficiency is mostly positive, but if people compete with one another, the rush to escape may cause a stampede. Inertia refers to the tendency of pedestrians who are familiar with the interior layout of the station to choose the routes that they normally walk or know best when evacuating. In addition, the choice of evacuation routes and the acceptance of evacuation instructions by the crowd are also related to individual personalities.
External factors that influence evacuation efficiency are mainly the characteristics of the building and the internal lighting, announcements, and evacuation guidance signs. Building characteristics include the building’s fire detectors, fire-fighting facilities, emergency evacuation routes, and the width of emergency evacuation routes. The location and brightness of the emergency signage will have a great impact on the efficiency of the evacuation process. Reasonable and effective evacuation signs will quickly help people find their way out and escape to a safe area.
In order to meet the software requirements of this study, Anylogic simulation software was chosen to simulate the emergency evacuation process of pedestrians. AnyLogic is a multi-method simulation modeling software developed by XJ Technologies [
32]. The software supports a variety of modeling methods, such as agent-based modeling, dynamic system simulation, discrete-event simulation, and system dynamics [
33]. Owing to its open and flexible modeling environment, the software is applied in a wide range of areas, such as transport, logistics, control systems, the military, and education. AnyLogic allows the observation of system behavior over time at any level of detail, provides for increased accuracy and more precise forecasting, and can be animated in 2D/3D so that models can be verified more easily [
34]. The AnyLogic software package is a powerful platform with a well-developed pedestrian library and many methods for collecting the statistical results of a simulation, making it easy to implement the agent approach completely [
35].
The Anylogic simulation modeling process is divided into three main parts. First, the physical model is built. The physical model must match the actual layout of the simulated environment, so a base map of the building's layout is imported before the model is drawn. Note that Anylogic has its own scale, which can be set as needed, and that Anylogic's pedestrian library provides spatial markers to help draw the geometry. Because a multi-floor environment is to be simulated, multiple layers are created to represent the different floors, with the height difference between floors matching the actual situation. Next, the pedestrians' behavior flow is set up. Modules are selected from the pedestrian library to represent the logical flow of pedestrians, each module is matched to its corresponding spatial marker, and the module parameters are set as required. Finally, the simulation parameters are set. They follow the actual composition of the simulated population, such as the proportions of genders and age groups, and a corresponding comfortable speed is set for each group. To realize the emergency evacuation function, events need to be defined and functions need to be called.
This study simulates the evacuation routes of pedestrians in an emergency through Anylogic, with the aim of ensuring that pedestrians choose the nearest route and can leave the scene quickly and safely. An underground station scenario is created in Anylogic and divided into two levels, as shown in
Figure 3, which presents the ground-level layout and the negative-level layout of the station.
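The nearest-route goal stated above can be illustrated with a short sketch that is independent of Anylogic. The coordinates, the names of the stairs and exits, and the use of straight-line distance (which ignores walls and obstacles) are all assumptions made for illustration; in the actual model, route finding is handled by the simulation software.

```python
import math


def nearest(point, candidates):
    """Name of the candidate location closest to `point` (straight line)."""
    return min(candidates, key=lambda name: math.dist(point, candidates[name]))


def evacuation_route(position, floor, stairs, exits):
    """Ordered waypoints for one pedestrian.

    floor 0 is the ground level; any other floor first requires a staircase.
    The sketch assumes each staircase arrives at the same (x, y) on floor 0.
    """
    route = []
    if floor != 0:
        stair = nearest(position, stairs)
        route.append(stair)
        position = stairs[stair]
    route.append(nearest(position, exits))
    return route


# Hypothetical layout in meters.
exits = {"exit_A": (0, 0), "exit_B": (100, 0)}
stairs = {"stairs_1": (20, 10), "stairs_2": (80, 10)}

print(evacuation_route((70, 30), -1, stairs, exits))  # ['stairs_2', 'exit_B']
print(evacuation_route((10, 5), 0, stairs, exits))    # ['exit_A']
```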
As Figure 3 shows, the ground floor of the station provides pedestrian access and ticketing services, while the negative level of the station is the main passenger area. The ground and negative levels are connected by stairs and lifts so that pedestrians can enter and exit the station. Once the metro station model has been created, the behavioral parameters of the people entering the station need to be set. The main modules used include PedSource, PedService, PedGoTo, PedWait, PedEnter, PedExit, and PedSink. The commonly used modules of Anylogic's pedestrian library are introduced in
Table 1.
In this article, we generate a certain number of pedestrians through PedSource and set the age, gender, and comfortable speed of each person in the Person agent. Under normal circumstances, pedestrians entering the station go to the appropriate area on the appropriate floor to purchase tickets, queue for trains, etc. Which floor or area a pedestrian goes to is chosen with the SelectOutput module, which can select by condition or by probability according to the actual situation. When a pedestrian arrives at the corresponding area and stays there as part of the normal behavioral flow, the PedService or PedWait module can be used to simulate this process. The service time or delay time can be set in the module to simulate how long the person receives the service or stays in the area. PedWait is used in this paper to indicate that the person stays in the corresponding area of the metro station to purchase tickets, check tickets, and queue for trains.
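The behavior flow described above is configured graphically in Anylogic. Purely as an illustration of its logic, the following sketch mimics the conditional and probabilistic routing of SelectOutput and the delay of PedWait in plain Python. It does not use the Anylogic API; the class, function names, area names, and numeric values are all hypothetical.

```python
import random
from dataclasses import dataclass


@dataclass
class Pedestrian:
    age_group: str
    gender: str
    comfortable_speed: float  # m/s, illustrative value
    has_ticket: bool


def select_output_by_condition(pedestrian):
    """Conditional routing: ticket holders go straight to the platform."""
    return "platform" if pedestrian.has_ticket else "ticket_area"


def select_output_by_probability(rng, probability_ticket_area):
    """Probabilistic routing: send a given share to the ticket area."""
    return "ticket_area" if rng.random() < probability_ticket_area else "platform"


def wait_time(rng, min_seconds, max_seconds):
    """Delay spent in an area, drawn uniformly from the given range."""
    return rng.uniform(min_seconds, max_seconds)


rng = random.Random(42)  # fixed seed so that a run can be repeated
pedestrian = Pedestrian("adult", "female", 1.3, has_ticket=False)
area = select_output_by_condition(pedestrian)
print(area, round(wait_time(rng, 20, 60), 1))
```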
4. Conclusions and Future Work
In this paper, we propose an emergency evacuation scheme based on a combination of YOLOv5s-SE target detection and Anylogic simulation. We first detect the heads of pedestrians inside a station with the YOLOv5s target detection network. We added the SE attention mechanism to the YOLOv5s network and used the NWD and IoU metrics together, which both avoids overly slow convergence and improves the accuracy of the model for small target detection. To demonstrate the advantages of the improved YOLOv5s network, we compared it with other advanced target detection networks, and the experimental results showed that the improved network model has higher detection accuracy and faster detection speed. Once the locations of the pedestrians were determined, they were fed into Anylogic's pre-built emergency evacuation model for simulation. The simulation results showed that pedestrians inside the station eventually leave through the exit nearest to them, and that those who are not on the ground floor first reach the ground floor by the nearest stairs and then head for the nearest exit. In practice, the simulation results can be communicated to pedestrians by means of screens, announcements, and staff inside the station, guiding them to follow the simulated emergency evacuation route. The Anylogic-based emergency evacuation model is easy to modify, allowing staff to adapt the model to the actual situation on site and quickly simulate a new evacuation route. In summary, both the YOLOv5s target detection model and the Anylogic emergency evacuation model are fast enough to provide a suitable solution within a short period of time after an emergency event and are well suited to the emergency evacuation of pedestrians within a station.
Although the proposed method is able to detect pedestrians inside the station and simulate a reasonable emergency evacuation route, there are still areas for improvement. In terms of target detection, the complexity of the actual situation inside the station still causes some missed detections. In terms of emergency evacuation simulation, the method only finds the shortest route for pedestrians who behave rationally and does not take into account the impact on evacuation of irrational behavior caused by fear and other psychological factors during actual emergencies.
In future research, on the one hand, the target detection dataset needs to be expanded, and a better-performing target detection model needs to be trained through continued improvement of the network structure. On the other hand, the psychological factors of pedestrians, personality factors, and interactions between pedestrians will be taken into account in the emergency evacuation model to further improve the structure of the model and simulate the emergency evacuation process more realistically.