1. Introduction
Natural disasters can threaten people’s safety and disrupt the order of production over long periods [1,2]. They are characterized by their widespread nature and strong destructiveness; common natural disaster events include earthquakes, floods, and typhoons. In recent years, the most influential natural disasters have included the Tangshan earthquake in 1976, the Wenchuan earthquake in 2008, the Chile 9.5 magnitude earthquake in 1960, the ‘JiangHuai flood’ in 1991, the ‘catastrophic flood’ in 1998, the ‘flood disaster’ in 2020, the ‘7.20 heavy rain’ in Henan Province in 2021, the ‘9.16 earthquake’ in Luxian County in 2021, Typhoon Haiyan in 2013, Typhoon Haima in 2014, Typhoon Meranti in 2015, Typhoon Haikui in 2023, and so on [
3]. The total number of people affected by disasters in the world between 2000 and 2022 was almost 4.5 billion [
4]. The occurrence of natural disasters can further lead to various types of disaster events, forming a chain of disasters with serial effects and severe destruction [
5]. At present, disaster management mainly focuses on the monitoring and management of primary disasters such as floods, earthquakes, storms, wildfires, and building fires [
6]. Secondary disasters further derived from disasters are also destructive [
7]. For example, after the Wenchuan 8.0 earthquake in 2008, many houses were seriously damaged and collapsed in succession over a long period. The 7.3 magnitude Hanshin earthquake in Japan in 1995 caused large casualties, and gas leakage led to serious fires [
8,
9]. After the occurrence of natural disasters, the on-site environment is unstable, which can easily lead to further disaster events and thus form a disaster chain. For example, when a natural disaster occurs, a fire caused by a short circuit of the electrical system in the environment can easily lead to further disaster events [
10]. If there are flammable and explosive substances in the environment, they can ignite or explode when exposed to an open flame. In addition, if toxic material leakage in the environment is not identified and treated in a timely manner, if trapped persons are not promptly found and rescued, and if damaged houses that are in danger of collapse are not given the necessary early warning, further disaster events will occur, resulting in an even more serious threat to life and property [
11]. After natural disasters, the secondary disaster events that may occur include primary fire, the burning of flammable materials, the explosion of explosive materials, toxic material leakage, injured persons requiring rescue, and building collapse.
To investigate the correlation between the types of secondary disasters and common everyday items, in this study, we establish datasets for six secondary disaster factors, namely, fire, flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings, of which fire is the first category of secondary disaster factor and the other five form the second category. The formation rules of disaster events after natural disasters occur are analyzed using computer vision methods. A CDMV (Class Decision making by Models Vote) decision method for improving the AP value is constructed; on this basis, the ResNet-CDMV model is formed by combining the CDMV method with the single ResNet model [12] training method, which realizes the classification and recognition of multiple objects in an image. This capability is of great significance for the monitoring and early warning of disaster events.
The rest of this article is arranged as follows: Section 2 introduces the relevant research on secondary disaster monitoring and early warning. The proposed framework is explained in Section 3. The experimental results are presented in Section 4. Section 5 discusses the scientific nature of the framework proposed in this article. Finally, the research is summarized in Section 6.
2. Related Studies
In this study, the secondary disaster factors refer to common objects or phenomena in life that can cause six kinds of disaster events, namely, fire, flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings; in particular, fire at a natural disaster site worsens the other five kinds of secondary disaster events [13]. Therefore, after a natural disaster, the identification of fire and the other five kinds of disaster events is the key work that needs to be completed. In this section, we summarize and analyze the current research status of methods for reducing secondary disasters.
In terms of fire monitoring methods, Xu et al. [14] used the YOLOv5 (You Only Look Once v5) algorithm to identify fires and added three convolutional attention modules to the algorithm to improve its key feature extraction capability; at the same time, the C2f module was used to replace the original C2 module to obtain more information. Dou et al. [15], using the convolutional attention module, optimized YOLOv5, substituting BiFPN for PANet and transposed convolution for neighbor interpolation, and further used MobileNet V3, ShuffleNet V2, and GhostNet to improve the performance of the YOLOv5 model. Mondal et al. [
16] proposed an integrated device of real-time fire detection and automatic fire extinguishing based on computer vision and developed a unique local fire location technology for fire location combined with fire color characteristics. Shen et al. [
17] introduced an improved adaptive lightweight FireViT fire identification method based on MobileViT. To better adapt to the irregular changes in smoke and flame in fire scenarios, deformable convolution and an improved adaptive activation function were introduced in their study to improve the performance of the network model. Ko et al. [
18] combined the current frame of a video with the corresponding block in the previous frame to determine whether the phenomenon with flow characteristics is smoke and used it for fire warning. Tomoaki et al. [
19] proposed a Bayes–Poisson regression analysis method to study the relationship between fire probability and ground motion intensity during seismic events in Japan from 1995 to 2022. The evaluation indexes of earthquake intensity are peak ground acceleration, peak ground velocity, and Japan Meteorological Agency earthquake intensity. Lu et al. [
20] proposed a physical model-based on-site fire spread simulation and smoke visualization method for use after an earthquake. In this study, a fire dynamics simulator was used to build a city-scale fire scene after an earthquake.
In terms of explosive detection, Baiyi Zu et al. [
21], based on the recent progress on nanostructured vapor phase explosive gas sensors that operate in dark conditions, highlighted the attractiveness of developing optoelectronic sensors for vapor phase explosive detection and proposed employing photocatalysis principles to enhance the sensitivity. Xuan He et al. [
22] used a simple and efficient self-approach strategy to apply ultrasensitive and self-reviving ZnO–Ag hybrid surface-enhanced Raman scattering sensors for the highly sensitive and selective detection of the explosive TNT in both solution and vapor phases. Manvinder Sharma et al. [
23] studied explosive detection methods, standoff spectroscopy-based methods, and LIBS; compared the existing methods of trace detection; and reviewed the world’s smallest drone, developed in Israel, which can detect explosives and drugs from a distance of 2.8 km. In terms of chemical leakage, because chemicals are often toxic or corrosive, the methods for monitoring them are relatively complex; monitoring is usually carried out after an accident according to national standards and a scientific disposal process to further identify the various chemical types, and it is often used as a targeted reference for subsequent environmental management. It is difficult for existing technical means to achieve timely monitoring and early warning at the scene of secondary disasters [
10,
24,
25]. In terms of dangerous building detection, Lu et al. [
26] built a solid wall, applied horizontal shear force onto the surface to make the wall deform and lean, simulated the static process in the wall when an earthquake occurred, simulated the acceleration outside the wall in the process of wall collapse using the experimental method, and studied the triggering boundary conditions of the fall of external protective components. This study provided an empirical model for the falling of wall attachments. Ji et al. [
27] studied the seismic performance of glass walls under different loads during earthquake occurrence, adopted quasi-static and dynamic in-plane loading methods to simulate the load boundary conditions of glass walls during earthquake occurrence, and carried out experimental research in a homemade full-size all-tempered insulating glass curtain wall system. The experimental results show that the stress concentration at the diagonal part of the glass contact with the frame can lead to the breakage of the glass plate. Xu et al. [
28] proposed adopting the concentrated mass shear model of multi-story structures and specific criteria for the falling of external non-structural parts. Considering the uncertainty of the earthquake and vibration process, the incremental dynamic analysis method was used to predict the distribution probability of falling objects during the design life of building clusters, and the urban seismic elastic–plastic analysis method was used to obtain the floor velocity of the floor where falling objects landed. On this basis, the landing distribution law of falling objects is obtained by jointly using a projectile (flat-throw) motion model. In terms of injured personnel rescue, Clara Obregón et al. [
29] highlighted the social systems that enable community-to-community support as well as potential opportunities for providing external aid to support communities more efficiently. They suggested that community-to-community support is critical in the first weeks after a disaster and highlighted the fact that the roles that different support networks play at different stages of disaster response are critical not only to improve people’s and institutions’ abilities to recover from particular disruptions but also in broader efforts to strengthen community resilience in the face of climate change. Rui Gao et al. [
30], for a small-area deployment scenario, proposed a small-area UAV deployment method that improves the Broyden–Fletcher–Goldfarb–Shanno algorithm by refining the iterative step size and search direction, addressing the high computational complexity of the traditional Broyden–Fletcher–Goldfarb–Shanno algorithm. For a large-area deployment scenario, to address the premature convergence of the standard genetic algorithm, a large-area UAV deployment elitist-strategy genetic algorithm was proposed through improved selection, crossover, and mutation operations. Sanjoy Debnath et al.’s [
31] review provided a comprehensive survey of the widely used communication technologies applied for setting up emergency communication networks to mitigate the post-disaster aftermath; their review also delivered an overview of the integration of new technologies with existing standards for improving the performance of disaster communication networks. Finally, they proposed some promising solutions to overcome the limitations of existing emergency communication technologies and improve overall network performance. In other studies, situational knowledge metadata included information about event characteristics, emergency prevention preparation, and the time, resource, information, and business continuity constraints of event disposal, and the scenario construction of electric power emergency communication support for natural disaster sites was realized with temporal and spatial decomposition [
32,
33].
For the monitoring and early warning of the various types of secondary disaster factors, current research mostly adopts basic theory, experiments, and simulation, either in combination or independently [34,35]. At present, there is a clear demand for monitoring and early warning of disaster factors, but the technologies and methods for the specific implementation process are too scattered: the identification of flammable objects, explosive objects, toxic substances, trapped persons, and building collapse remains at the level of mechanism analysis, single-technology realization, model analysis, and other forms of interpreting and applying physical phenomena, which are difficult to use for monitoring before secondary disasters occur.
In terms of fire identification, relevant scenes are identified with the participation of advanced algorithms such as artificial intelligence and deep learning [36]. Like flammable objects, explosive objects, toxic substances, and trapped persons, dangerous buildings also have clear physical sources. The visual characteristics of these physical objects are obtained by constructing the correlation between the physical objects and disaster event types. Using deep learning algorithms to identify these phenomena and build early warning strategies for secondary disaster events is an important task that needs to be carried out in step with the current level of scientific and technological development.
3. Materials and Methods
3.1. Construction of Disaster Factor Dataset
After natural disasters, the main causes of secondary disaster events include fires, electric bicycles, flammable gas storage tanks, strong alkali, acid diluents, pesticides, people trapped in dangerous environments in need of rescue, damaged or cracked buildings that may collapse, etc. In general, there are six types of secondary disaster sources. In this study, these sources are divided into two types: category I includes fire, and category II includes flammable objects, explosive objects, toxic substances, trapped personnel, and dangerous buildings, as shown in
Figure 1.
Because the luminous phenomena generated by fires are easily confused with four types of similar non-fire phenomena in everyday life, namely, sunsets, welding, strong light, and weak light, this study combined these four phenomena with fire to constitute the category I secondary disaster factor dataset so that they can be distinguished from the bright characteristics of fires. Fire is ‘factor 1’ in the dataset (it consists of pre-fire smoke, field fire, building fire, and indoor fire images); the other four phenomena do not cause secondary disaster events and are included in the dataset only to assist the deep learning model in exclusively extracting the features of fire phenomena.
The category II secondary disaster factors include flammable objects (consisting of electric bicycle and battery images); explosive objects (consisting of images of liquefied petroleum gas, acetylene, and oxygen tanks, which contain high-pressure flammable or combustion-supporting gases); toxic substances (consisting of corrosion-resistant plastic drums, strong alkalis, strong acids, and pesticides, with ‘dangerous chemicals’ warning signs added to the images to enhance the Convolutional Neural Network’s ability to learn complex features); trapped persons (consisting of images of adults and children); and dangerous buildings (consisting of images of the characteristic cracks in buildings that can cause them to collapse). Among these, flammable objects are disaster ‘factor 2’, explosive objects are disaster ‘factor 3’, toxic substances are disaster ‘factor 4’, trapped persons are disaster ‘factor 5’, and dangerous buildings are disaster ‘factor 6’.
In this study, to better extract the characteristics of the relevant factors and distinguish them from common disaster factor analogs in everyday life, five object analogs, namely, bicycles, iron containers, garbage cans, animals, and brick cracks, were set to correspond with disaster factors 2, 3, 4, 5, and 6, respectively, to assist in training the deep learning model. The dataset therefore contains 15 classes in total (the secondary disaster factors and the similar scene objects) used for feature extraction of the secondary disaster factors, with each class consisting of 1000 images, comprising 800 in the training set, 100 in the validation set, and 100 in the test set.
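For illustration only, the per-class 800/100/100 split could be produced with a short script such as the following sketch (the directory layout, file extension, and helper name are hypothetical and not part of the study):

```python
import random
import shutil
from pathlib import Path

def split_class_images(class_dir: str, out_dir: str, seed: int = 0) -> None:
    """Split the 1000 images of one class into 800/100/100 train/val/test."""
    images = sorted(Path(class_dir).glob("*.jpg"))
    random.Random(seed).shuffle(images)
    subsets = {"train": images[:800],
               "val": images[800:900],
               "test": images[900:1000]}
    for subset, files in subsets.items():
        target = Path(out_dir) / subset / Path(class_dir).name
        target.mkdir(parents=True, exist_ok=True)
        for f in files:
            shutil.copy(f, target / f.name)

# Example: apply the split to all 15 classes of the dataset
# for cls_dir in Path("dataset").iterdir():
#     split_class_images(str(cls_dir), "splits")
```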
3.2. Mathematical Principles of the CNN
As an important method of deep learning on image features [37], a CNN (Convolutional Neural Network) uses convolution kernels and pooling to obtain an object’s feature information from an image. In this study, the CNN is used to learn the visual features of secondary disaster factors in various types of images, as shown in Figure 1. Taking fire as an example, fire identification based on the CNN is shown in Figure 2.
The CNN realizes the extraction of associated information between adjacent pixels from the image through a convolution kernel, and its principle can be expressed as follows:
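The original formulas are not reproduced here; a plausible reconstruction of Formulas (1) and (2), consistent with the variable definitions below and assuming a standard convolution followed by an activation function f (an assumed symbol), is:

z_i^{l}(j,k) = \sum_{s}\sum_{m}\sum_{n} w_{i,s}^{l}(m,n)\, a_{s}^{l-1}(j+m,\; k+n) + b_i^{l}   (1)

a_i^{l}(j,k) = f\big(z_i^{l}(j,k)\big)   (2)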
In Formulas (1) and (2), j and k are the j-th row and k-th column of the feature vector, w is the weight, i indexes the feature vectors of the neurons in the next layer, s indexes the feature vectors of the previous layer, m and n index the positions within the convolution kernel, b is the bias, z is the input of the neurons in this layer, a is the output of the neurons in this layer, and l denotes the l-th layer.
The detailed features obtained after the convolutional calculation contain a large amount of redundant information; therefore, to simplify the calculation, the redundant information is filtered with a pooling kernel to reduce the number of parameters, and the pooling method reduces the number of parameters by a factor of n² through an n × n pooling kernel. In this study, the average pooling kernel is used to extract the features; its mathematical principle can be expressed as follows:
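The original formula is not reproduced here; a plausible reconstruction of the average pooling operation in Formula (3), consistent with the definitions that follow, is:

a_p^{l}(j,k) = \frac{1}{m\,n}\sum_{u=0}^{m-1}\sum_{v=0}^{n-1} a^{l}\big(j\,d+u,\; k\,d+v\big)   (3)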
In Formula (3), ap is the output of the pooling layer, a is the output of the neurons in this layer, d is the pooling stride, m and n are the dimensions of the pooling kernel, j and k are the j-th row and k-th column of its feature vector, and l denotes the l-th layer.
The function of the linear layer is to classify the features extracted by the convolution and pooling process; it is a standard fully connected neural network. The number of nodes at the end of the linear layer is designed according to the number of class types in the input data. During training, the weights of the whole network are obtained through multiple iterations using ground-truth labels combined with the gradient descent method. Once the weights are obtained through this supervised feature learning, the model training process is complete. When a sample from the test dataset is input, the unknown sample can be judged according to the values of the output nodes of the linear layer, and the ‘softmax’ mathematical calculation process can be expressed as
where z is the input prediction array, and m is the total number of classes. The decision process of the softmax function in multi-classification tasks is shown in Formula (5): where m is the total number of classes, xi is the sample to be identified, yi is the label corresponding to the sample, and θTxi is the decision boundary condition of the classification.
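Plausible reconstructions of Formulas (4) and (5), consistent with the definitions above (the standard softmax function and the resulting class decision), are:

\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{c=1}^{m} e^{z_c}}   (4)

P(y_i = c \mid x_i; \theta) = \frac{e^{\theta_c^{T} x_i}}{\sum_{j=1}^{m} e^{\theta_j^{T} x_i}}, \qquad \hat{y}_i = \arg\max_{c \in \{1,\dots,m\}} P(y_i = c \mid x_i; \theta)   (5)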
In the network training process, the ‘one-hot’ label is combined with the gradient descent method to map the class of the samples from the input layer. In this study, the one-hot labels for sunset, welding, strong light, weak light, and fire, according to the order of the corresponding neural network outputs 1–5, are [1, 0, 0, 0, 0], [0, 1, 0, 0, 0], [0, 0, 1, 0, 0], [0, 0, 0, 1, 0], and [0, 0, 0, 0, 1], respectively, and the CNN weight calculation process is as follows:
where S is the value obtained after subtracting the label value from the output node value of the linear layer, j is the j-th class of the five classes, k is a non-j class of the five classes, and ∇ is the gradient descent method. When the model obtained through the training dataset is used to identify the information in an unknown physical scene picture (from the test dataset), the linear layer labels in this study are [sunset, welding, strong light, weak light, fire]T.
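Formula (6) is likewise not reproduced here; one plausible form, consistent with the description above (output-minus-label residuals S driving a gradient descent weight update, with the learning rate η as an assumed symbol), is:

S_j = a_j^{L} - 1, \qquad S_k = a_k^{L} - 0 \;(k \neq j), \qquad w^{(t+1)} = w^{(t)} - \eta\, \nabla_{w} \sum_{c=1}^{5} S_c^{2}   (6)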
3.3. Image Classification Using CDMV Method Based on CNN
The CNN model learns the features of the secondary disaster factors from the training set. The traditional strategy is to balance the proportion of each class in the training set so that the different classes have equal learning opportunities and the CNN model obtains a high mAP on the test set. If a certain class’s proportion in the training set is increased, the CNN will have more opportunities to learn the characteristics of that class. To increase the learning opportunity of a single class so that the CNN model can improve that class’s AP, this study proposes a ‘Class Decision making by Models Vote (CDMV)’ calculation method. The implementation process is as follows:
Step 1: For each class, a CNN model with the highest AP value for that class is obtained by increasing that class’s feature learning opportunity in the training set; a similar CNN model is trained for every class.
Step 2: The multiple single-class recognition advantage CNN models obtained in Step 1 are combined. Each CNN model outputs the ‘softmax’ result only for the one class for which it has a recognition advantage, yielding multiple probability reference values.
Step 3: The multiple probability reference values are compared, and the maximum value is taken as the final CDMV decision result.
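A minimal sketch of this voting step is given below (hypothetical names; it assumes each advantage model exposes a callable that returns its full softmax vector, and that the renormalization described in the Discussion is applied before the comparison):

```python
def cdmv_decide(models, advantage_classes, image):
    """Class Decision making by Models Vote (CDMV), sketched.

    models            -- list of trained single-class-advantage CNN models;
                         each returns a softmax vector for `image`
    advantage_classes -- advantage_classes[i] is the class index that
                         models[i] recognizes best
    Returns the winning class and the normalized reference values.
    """
    reference = {}
    for model, cls in zip(models, advantage_classes):
        probs = model(image)          # full softmax output of this model
        reference[cls] = probs[cls]   # keep only its advantage-class value
    total = sum(reference.values())   # renormalize the kept values
    normalized = {c: v / total for c, v in reference.items()}
    decided_class = max(normalized, key=normalized.get)
    return decided_class, normalized
```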
The advantage of the CNN-CDMV algorithm is that under the same CNN algorithm (AlexNet, GoogleNet, ResNet-50, VGG, EfficientNet, MobileNet, ShuffleNet, etc.), the mAP obtained with CNN-CDMV is higher than that obtained using a single CNN model, and its mathematical principle can be expressed as follows:
where Pki is the AP of class k under model i, Pi is the mAP of model i, n is the total number of classes, N is the number of models (N = n), and Pb is the mAP under the CNN-CDMV algorithm.
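The original formulas are not reproduced here; a plausible reconstruction of Formulas (7)–(9), consistent with the definitions above (the mAP of each model, the best single-class AP across models, and the CDMV mAP, where P_k* is an assumed symbol for the best AP of class k), is:

P_i = \frac{1}{n}\sum_{k=1}^{n} P_{ki}, \qquad i = 1,\dots,N   (7)

P_k^{\ast} = \max_{i \in \{1,\dots,N\}} P_{ki}, \qquad k = 1,\dots,n   (8)

P_b = \frac{1}{n}\sum_{k=1}^{n} P_k^{\ast}   (9)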
Formula (7) is used to obtain the n models based on the n classes, Formula (8) is used to take the CNN model with the strongest recognition performance for a single class to represent the recognition performance of that class, and Formula (9) is used to obtain the total classification accuracy Pb under CDMV. Because a single CNN model has cognitive bias in its recognition performance across different classes, the maximum AP values of the n classes cannot all appear in the same CNN model; therefore, the total classification accuracy of CNN-CDMV is higher than that of any single CNN model (Pb > Pk), and its principle is shown in Formula (11). Generalized to any number of classes, as the number of classes in the CNN model recognition task increases, the mAP obtained with the CNN-CDMV algorithm remains higher than that of a single CNN model. This process can be described as follows:
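One plausible form of the inequality referred to as Formula (11), under the same notation (with P_k* as defined above), is:

\Delta P = P_b - P_i = \frac{1}{n}\sum_{k=1}^{n}\big(P_k^{\ast} - P_{ki}\big) \ \geq\ 0 \quad \text{for any model } i   (11)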
In Formula (11), n is the number of CNN models (there are n models when there are n classes), Pn is the mAP of the n models, the classification accuracy of each corresponding subtype is [P1i, P2i,⋯, Pki,⋯, Pnn], Pb is the mAP obtained with the CNN-CDMV algorithm, and Pk is the mAP obtained for any one of the CNN models.
Assuming that among the n CNN models the optimal classification of class i comes from model i (Pii), the mAP obtained with the CNN-CDMV algorithm is higher than that obtained by a single CNN model (∆P ≥ 0) regardless of how many objects are to be classified. The model acquisition process of the CNN-CDMV algorithm for fire identification, shown in Algorithm 1 (based on ResNet-50 [12] training), is as follows:
Algorithm 1. Model acquisition process of the CNN-CDMV algorithm
Model-i
Input: Dataset D, split
Initialization: € = load CNN pretrained model
Step 1. split: [train, validation, test] = [unbalanced division, 100 pictures, 100 pictures]; training_parameters = [epochs = 30, learning rate = 0.001, batch size = 32, validation interval (in epochs) = 1, solver type = Adam]
Step 2. [α, β, γ] = prepare_data(D, split); α = random(D, Datatrain); β = random(D, Dataval); γ = random(D, Datatest); return [α, β, γ]
Step 3. Ω = €(α, β, training_parameters)
Step 4. [TP, FP, FN, TN] = Ω(γ)
Step 5. [TP, FP, FN, TN] → [class1: Pre = P1, class2: Pre = P2, class3: Pre = P3, class4: Pre = P4, class5: Pre = P5]
Step 6. Pi = max{P1, P2, P3, P4, P5}, i ∈ [1, 2, 3, 4, 5]
Step 7. Pi → Model-i prepared
Step 8. Repeat Steps 1–7 → [Model-1, Model-2, Model-3, Model-4, Model-5] → parallel structure model
Step 9. Ω = €(parallel structure model: γp)
Step 10. Output: Ap = Ω(γp)
Here, α is the training data, β is the validation data, γ is the test data, Ω is the trained model, D is the dataset, A is the accuracy, TP is true positive, FP is false positive, FN is false negative, TN is true negative, Pre is precision, Pj is probability, γp is the parallel-structure test data, and Ap is the parallel-structure decision layer accuracy.
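For illustration only, one iteration of Algorithm 1 (training a single-class advantage model on an unbalanced split and finding its advantage class) might be sketched in PyTorch as follows; the directory arguments and precision bookkeeping are hypothetical and are not part of the original algorithm:

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

def train_advantage_model(train_dir, val_dir, num_classes=5,
                          epochs=30, lr=0.001, batch_size=32):
    """Algorithm 1, Steps 1-6, for one unbalanced split (a sketch)."""
    tf = transforms.Compose([transforms.Resize((224, 224)),
                             transforms.ToTensor()])
    train_loader = DataLoader(datasets.ImageFolder(train_dir, transform=tf),
                              batch_size=batch_size, shuffle=True)
    eval_loader = DataLoader(datasets.ImageFolder(val_dir, transform=tf),
                             batch_size=batch_size)

    # "load CNN pretrained model": ResNet-50 backbone, as in the paper
    model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, num_classes)

    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # solver = Adam
    criterion = nn.CrossEntropyLoss()

    for _ in range(epochs):                                  # epochs = 30
        model.train()
        for images, labels in train_loader:
            optimizer.zero_grad()
            criterion(model(images), labels).backward()
            optimizer.step()

    # Steps 4-6: per-class precision on held-out data, then keep the class
    # for which this model has the recognition advantage.
    model.eval()
    tp = torch.zeros(num_classes)
    fp = torch.zeros(num_classes)
    with torch.no_grad():
        for images, labels in eval_loader:
            preds = model(images).argmax(dim=1)
            for c in range(num_classes):
                tp[c] += ((preds == c) & (labels == c)).sum()
                fp[c] += ((preds == c) & (labels != c)).sum()
    precision = tp / (tp + fp).clamp(min=1)
    advantage_class = int(precision.argmax())
    return model, advantage_class
```

Repeating this sketch for each of the five unbalanced splits would yield Model-1 through Model-5, which are then combined into the parallel structure evaluated in Steps 8–10.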
3.4. Multi-Model Cluster Decision of Disaster Factors at Disaster Sites
In daily life, the main common physical objects that can cause secondary disaster events at disaster sites include category I fire and category II combustible objects, explosive objects, toxic substances, trapped persons, and dangerous buildings. Given the high-precision requirement for secondary disaster event monitoring and early warning, our CNN-CDMV method was used to learn the visual features of secondary disaster factors. In this process, unbalanced training sets are divided so that each single CNN model gains an advantage in identifying a single secondary disaster factor. The training set of each single CNN model is divided as shown in
Table 1.
As shown in
Table 1, the fire dataset of the category I secondary disaster factor includes five classes: sunset, welding, strong light, weak light, and fire. The four classes of sunset, welding, strong light, and weak light have no relation to disaster events, but they are highly similar to fire; they were added to the training set together with the fire phenomena to reduce their influence on fire identification in practice, indirectly improving the AP value of fire warning. The category II secondary disaster factors are flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings. In the CNN-CDMV identification process, to minimize the influence of similar everyday objects on these five classes, five similar objects, namely, bicycles, iron containers, garbage cans, animals, and brick cracks, were added to the training set; their proportion was 15% each, with each disaster factor’s proportion being 25%. Five unbalanced training sets were constructed according to the category II secondary disaster factors, and advantage models for the identification of flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings were trained based on the CNN. The ten CNN models for categories I and II together constitute the core of the CNN-CDMV decision method. The construction method is shown in
Figure 3.
In
Figure 3, the CNN-CDMV core is composed of ten CNN models: sunset, welding, strong light, weak light, fire, flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings. These ten models are used to identify the objects that can easily lead to secondary disasters. In the identification process of the category I secondary disaster factor models, the visual information produces the corresponding ‘softmax’ results after passing through the models, and the decision process shown in Figure 2 finally determines whether there is a fire in the scene. If the result is a fire, the ‘softmax’ result for fire is output. In the category II disaster identification models, if a model determines that the corresponding disaster factor is present, the ‘softmax’ result of that disaster factor is also output. For the same scene, CNN-CDMV can therefore output the ‘softmax’ results of six disaster factors, namely, fire, flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings. These results are used for secondary disaster event identification and early warning.
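As an illustration of this ten-model core, a minimal sketch (hypothetical function and variable names; each model is assumed to return its full softmax vector) of how the six disaster factor ‘softmax’ results could be produced for one scene image is:

```python
import numpy as np

def cdmv_core(category1_models, category2_models, image):
    """Ten-model CNN-CDMV core, sketched.

    category1_models -- {"sunset": (model, idx), ..., "fire": (model, idx)}
    category2_models -- {"flammable objects": (model, idx), ...}
    Returns the 'softmax' reference values of the disaster factors found.
    """
    results = {}

    # Category I: the five fire-analog models vote; fire is reported only
    # if the fire reference value wins the vote (cf. Figure 2).
    votes = {cls: model(image)[idx]
             for cls, (model, idx) in category1_models.items()}
    if max(votes, key=votes.get) == "fire":
        results["fire"] = float(votes["fire"])

    # Category II: each factor model reports its factor when it judges
    # that the factor is present in the scene.
    for factor, (model, idx) in category2_models.items():
        probs = model(image)
        if int(np.argmax(probs)) == idx:      # factor detected
            results[factor] = float(probs[idx])

    return results  # e.g. {"fire": 0.47, "trapped persons": 0.81, ...}
```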
5. Discussion
Currently, the main methods that can be used for image information recognition are image object detection and classification. The content recognition of images using the two methods is shown in
Figure 6.
Figure 6A is the recognition of the on-site image of secondary disaster events using the object detection method, (B) is an on-site image of disaster events, and (C) is the recognition of the on-site image of secondary disaster events with the classification method. The disaster events in the image contain four objects: cracks in buildings, trapped persons, flammable objects, and explosive objects. The object detection model in (A) can identify the four objects at the same time, while the image classification model in (C) can only recognize one object class from the image; that is, only one class in the image can be identified, so it requires four classification models to identify all four objects in the image. In this study, the classification method is used to construct a CDMV algorithm through multiple models to classify multiple objects in the image to realize the monitoring and early warning of possible secondary disaster events at the disaster on-site monitoring points.
In this study, the principle by which CDMV can improve the accuracy of image information recognition is that a single CNN model has a cognitive bias toward different objects. Taking the ‘VOC2007 + 2012’ dataset as an example, there are significant differences in the ability of a CNN model to recognize different objects: for SSD (Backbone is VGG-16 neural network), the AP of ‘bottle’ and ‘plant’ was 50.7% and 50.1%, and the AP of ‘cat’ and ‘horse’ was 85.8% and 86.9%, respectively [
43]. The DSSD (Backbone is ResNet-101 neural network) algorithm has 52.4% and 54.5% AP for ‘bottle’ and ‘plant’ and 84.3% and 88.5% AP for ‘cat’ and ‘horse’, respectively [
44]. The FD-SSD (Backbone is VGG-16) algorithm has 54.1% and 53.9% AP for ‘bottle’ and ‘plant’ and 86.8% and 88.0% AP for ‘cat’ and ‘horse’, respectively. The same is true for Faster R-CNN (Backbone is ResNet-101) [
38], R-FCN (Backbone is ResNet-50) [
45], YOLO (Backbone is GoogleNet) [
46], YOLOv5 (Backbone is DarkNet-19) [
47], DSOD (Backbone is DS/64-192-48-1) [
48], DF-SSD (Backbone is DenseNet-S-32-1) [
49], etc. In addition, to better identify certain types of features in images, relevant scholars have proposed YOLO hybrid attention mechanism methods such as NAM-YOLOv7 [50], HAM-YOLOv5 [51], and EfficientNet-YOLOv5 [52]; these methods significantly improve the CNN’s recognition performance for a certain feature among the classes. To improve the recognition accuracy of all classes, feasible mechanisms have also been studied based on the characteristics of the CNN itself. With the same training set proportions, different models vary in their ability to learn the features of a single class, and this limitation of the CNN model also exists in the process of learning the features of the whole sample. Due to the large number of weight parameters of the models and their complex compositions, it is difficult to describe the formation mechanism of this cognitive bias [53]. To identify the disaster factors at a disaster site, high recognition accuracy is required for each single class. Therefore, in multi-classification tasks, it is of great significance to overcome the cognitive bias of a single CNN model between different classes while keeping the overall recognition accuracy of the model high, so that the recognition accuracy of each single class reaches its best state and the monitoring and early warning of disaster events can be carried out accurately.
There are two situations in which the CDMV algorithm in this study improves image information recognition accuracy; in the first, the ‘softmax’ results output by the ResNet-50 and ResNet-CDMV models are less than 0.5. As shown in
Figure 7, the two models identify the scene class contained in image 7 in
Table 2. The real scene in the image is fire. The ‘softmax’ probability reference values output by a single ResNet-50 for the ① sunset, ② welding, ③ strong light, ④ weak light, and ⑤ fire scenarios are ① 0.533388, ② 0.001140279, ③ 0.000000185, ④ 0.000195662, and ⑤ 0.46527588, respectively. The value for sunset is the largest, so the ResNet-50 model considers the scene contained in image 7 to be a sunset. The ⑤ fire model in ResNet-CDMV gives the ‘softmax’ probability reference value of the ⑤ fire in image 7 as 0.47225126, and only this value is output; its values of ① 0.516423, ② 0.00204462, ③ 0.001644969, and ④ 0.00763621 for ① sunset, ② welding, ③ strong light, and ④ weak light, respectively, are not output. The ① sunset model, ② welding model, ③ strong light model, and ④ weak light model only output the ‘softmax’ probability reference values for ① sunset, ② welding, ③ strong light, and ④ weak light recognition, which are ① 0.31958973, ② 0.002117465, ③ 0.029371442, and ④ 0.0022207, respectively. These are normalized together with the ‘softmax’ probability reference value of the fire at the CDMV decision level to obtain ① 0.38712312, ② 0.00256491, ③ 0.03557802, ④ 0.00268996, and ⑤ 0.57204399. The normalized ‘softmax’ probability reference value of the fire is the largest; therefore, the ResNet-CDMV model correctly judges that image 7 contains a ⑤ fire. Images 1, 3, and 12 in
Table 2 show the same situation.
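As a worked check of this decision step, the renormalization is simply each advantage model’s reference value divided by their sum; using the values quoted above for image 7 (a sketch, assuming plain sum normalization):

```python
reference = {
    "sunset":       0.31958973,   # sunset advantage model output
    "welding":      0.002117465,  # welding advantage model output
    "strong light": 0.029371442,  # strong light advantage model output
    "weak light":   0.0022207,    # weak light advantage model output
    "fire":         0.47225126,   # fire advantage model output
}
total = sum(reference.values())                     # ≈ 0.8255506
normalized = {k: v / total for k, v in reference.items()}
# ≈ {'sunset': 0.3871, 'welding': 0.0026, 'strong light': 0.0356,
#    'weak light': 0.0027, 'fire': 0.5720}  -> fire wins the CDMV decision
print(max(normalized, key=normalized.get))          # 'fire'
```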
In the second situation, the ‘softmax’ probability reference values output after ResNet-50 and ResNet-CDMV model identification are all less than 0.5, as shown by image 2 in
Table 2; the real scene in this image is welding. The ResNet-50 model ‘softmax’ result values of ① sunset, ② welding, ③ strong light, ④ weak light, and ⑤ fire are 0.012486734, 0.32064572, 0.06004316, 0.19397007, and 0.41285425, respectively, in which the value of fire is the largest; therefore, the ResNet-50 model misidentifies the welding scene as the fire scene. The ResNet-CDMV model ‘softmax’ result values of ① sunset, ② welding, ③ strong light, ④ weak light, and ⑤ fire are 0.027238317, 0.66015106, 0.27432773, 0.3358008, and 0.35803366, respectively, in which the welding value is the largest. The ResNet-CDMV model correctly identifies the information contained in the image, and the same situation is shown in images 4, 5, 6, 8, 9, 10, and 11 in
Table 2.
The ResNet-50 algorithm can only recognize a single piece of information contained in an image, and the CDMV method proposed in this study integrates multiple ResNet-50 models trained for different purposes to realize the recognition of multiple objects in an image. From the perspective of algorithm efficiency, compared with multi-object detection algorithms, the proposed method has two disadvantages and one advantage. The first disadvantage is that the position of the disaster factor in the picture cannot be marked directly. The second disadvantage is that the mAP result of the ResNet-CDMV model is obtained through the calculation of 10 ResNet-50 models, so the time cost is 10 times that of a single ResNet-50 model. The advantage of the algorithm is that it overcomes the cognitive bias of a single model toward multiple objects and therefore achieves a higher mAP. Because the location of the detected object in the field environment is already determined, the algorithm only needs to give early warning of the types of secondary disasters present in the scene information; moreover, given the importance of secondary disaster events, recognition accuracy matters more than the time consumed. Therefore, the shortcomings of the CDMV method do not have a negative impact on the early warning effect. In addition, since the emergency handling of secondary disasters involves the allocation of emergency resources, there are strict requirements for high-precision identification and early warning; the advantage of the CDMV method therefore has important practical significance for secondary disaster monitoring and early warning. In further research, the multi-model parallel decision making can be integrated into a more complex model, and the nodes of the neural network can be divided into regions so that the memory units in a certain region focus on recognizing a certain type of feature, forming a multi-classification recognition model with higher accuracy under multi-region cluster decision making.
6. Conclusions
To investigate the types of secondary disasters that can be caused by common everyday objects after natural disasters, this study constructed image datasets of six secondary disaster factors (fire, flammable objects, explosive objects, toxic substances, trapped persons, and dangerous buildings), extracted the visual features of the secondary disaster factors through the ResNet-CDMV algorithm, and correlated them with the corresponding secondary disaster events. In terms of improving the early warning accuracy of secondary disaster events, this study makes two contributions:
In addition to the secondary disaster factors, the dataset includes common non-secondary disaster factor items in everyday life that are similar to the visual characteristics of the secondary disaster factors. Their existence can effectively improve the identification accuracy of the secondary disaster factors.
The CDMV kernel comprises a combination of multiple dominant models, and this overcomes the cognitive bias of a single CNN model for different types. This method can effectively improve the identification accuracy of secondary disaster factors in the ‘softmax’ layer.
The ResNet-CDMV method obtained in this study has an mAP value of 87% for the identification of secondary disaster factors. Compared with the Faster-RCNN, SSD, CornerNet, CenterNet, and YOLOv7 object detection algorithms, this represents improvements of 9.33%, 11.83%, 13%, 11%, and 8.17%, respectively. The high-precision early warning platform for secondary disaster events based on the ResNet-CDMV method has important practical significance for the prevention and management of secondary disasters after the occurrence of natural disasters.