1. Introduction
Industrial safety is a key enabling technology for upgrading industry to fully automated, intelligent production. Industrial accidents have distinctive characteristics: they are rare relative to the whole production life cycle, but their consequences are far-reaching [
1,
2]. Equipment reliability is not always the determining factor when assessing industrial safety risks; the safety of complex processes is influenced by both external and internal factors [
3,
4]. During production, the geographical location, raw material quality, weather conditions, unsafe environmental conditions, the technological process, unsafe operation by personnel, the unsafe state of objects, and many other factors can cause adverse events. Moreover, most raw materials and products of chemical enterprises are flammable, explosive, toxic, or otherwise harmful, and these dangerous goods easily cause casualties and property losses in accidents [
5,
6]. Early detection of these factors is essential for taking timely preventive measures. The abnormal behavior of complex process systems is usually studied through mathematical modeling, which reduces the study of a process to the study of the properties of its mathematical model. Accordingly, a risk early warning model for the production process of chemical enterprises can be established, and the massive data accumulated by the chemical industry over the years can be used for risk early warning. How to incorporate environmental factors in the production process into the safety assessment, and thereby provide safety-security decision support for chemical enterprise managers, has therefore become an urgent problem in the safety-security field of intelligent factories.
Most of the existing factory risk assessment methods use index or probability evaluation [
7,
8,
9]. Index-based methods make the system structure complex, while probability-based methods struggle to provide a feasible evaluation of individual risk units. Different evaluation methods focus on different production links and aspects, so it is difficult to form a complete system. Moreover, although single-factor and single-index evaluations have been widely used because they are simple and convenient, they suffer from shortcomings such as insufficiently comprehensive information and a tendency toward distortion, and they cannot meet the increasingly strict accuracy requirements of chemical factory risk evaluation. In recent years, risk evaluation has gradually developed from single-factor, single-index analysis toward systematic, comprehensive evaluation. For complex dangerous environments, the most commonly used evaluation methods include the comprehensive index method [
10], set pair analysis method [
11], fuzzy comprehensive evaluation method [
12], neural network method [
13], gray theory comprehensive evaluation method [
14], and ICI Mond evaluation method [
15]. Reference [
7] established a safety model of a storage and distribution station using system dynamics; the model captures the complex relationships among the station's safety-influencing factors. Reference [
10] established an index system for coal mine safety assessment from the factors that influence coal mine safety, built a comprehensive safety evaluation model based on the analytic hierarchy process, and proposed effective safety management measures. Reference [
11] built an evaluation and prediction model for occupational hazards in coal mines based on set pair analysis, which uses the three aspects of identity–discrepancy–contrast to study the relationship between the uncertainty and certainty of a factor or event. Reference [
12] established a quantitative risk assessment model of third-party damage based on the analytic hierarchy process and fuzzy comprehensive evaluation: the factor weights are determined by an improved analytic hierarchy process, and the importance of each factor is calculated by the fuzzy comprehensive evaluation model. Reference [
13] proposed a 5M safety model tailored to the characteristics of rail transit safety assessment, including complexity, dynamics, and ambiguity, and used a neural network to evaluate railway safety dynamically. Gray numbers improve the ability of decision-making models to handle the ambiguity that arises from incomplete information; for example, a GM (1,1) model was constructed from annual datasets of work-related deaths in five categories: mining and commercial casualties, highway traffic accidents, railway traffic accidents, fire disasters, and all fatal casualties [
14]. The safety of a waste incineration power plant was assessed with the Imperial Chemical Industries (ICI) Mond fire, explosion, and toxicity index method, yielding the difference in the total risk coefficient R of the plant's units before and after compensation [
15]. Zhang et al. analyzed the evaluation method of the safety degree of chemical enterprises. A risk evaluation model of the coal chemical production process is established based on fuzzy comprehensive evaluation theory. It provides a theoretical basis for the safety production analysis of such coal chemical enterprises and puts forward practical suggestions for preventing accidents that may occur in the process of production [
16]. Orsoni A. combined simulation and fuzzy logic techniques, considered the domino effect of possible unexpected events, and conducted a systematic risk assessment of the design and layout configuration specified for the plant handling hazardous substances. The design schemes are evaluated and compared quantitatively by the fuzzy method [
17]. A mathematical model of the styrene production process function was established by using neural network technology. Based on the prediction results, some suggestions are put forward for the industrial safety assessment of special hazardous production processes [
18]. Bozzano M. et al. integrated system design activities with safety assessment and proposed methods that help safety engineers automate some stages of their work while maintaining an adequate level of safety [
19]. Alanen J. et al. established a network security risk analysis method for industrial control systems. Based on the hybrid risk assessment ontology, security risk assessment management is carried out, and the method is successfully demonstrated [
20]. However, most of the existing methods are greatly affected by subjective factors, such as the fuzzy comprehensive evaluation method [
21,
22]. Some, such as the neural network method, depend on the quality and quantity of sample data and lack a clear physical interpretation [
23,
24], and others, such as the comprehensive index method, are too complex to calculate, which hinders their adoption. The most important step in traditional evaluation methods is determining the evaluation indexes and their weights, and usually a single weighting method is used. Subjective weighting depends on the judgment of decision-makers and lacks objectivity, whereas objective weighting ignores differences in the importance of indicators to the evaluation objects [
25]. In practical evaluation work, different indicators differ in importance, and the assessment of their importance is affected by the subjective preferences of decision-makers. Meanwhile, the data collected by the production monitoring system are single point values that contain measurement error. Such values cannot effectively describe the fuzzy, complex, and uncertain accident patterns of the system, whose states usually fluctuate within a range.
To the best of our knowledge, no research has applied extension engineering to the safety-security evaluation of a factory or to the effect of the workshop production environment on personnel [
26,
27,
28,
29,
30]. Extension data mining takes extension sets as its set-theoretic basis and combines extension methods with existing data mining methods to mine knowledge based on extension transformations from a database or data warehouse, providing a basis for decision-making and technical innovation in the economy, finance, management, marketing, planning, medicine, design, and other fields [
31,
32]. Extenics is a discipline founded by scholars led by Cai Wen and Yang Chunyan. It uses formal models to study the possibility of extending things and the rules and methods of exploration and innovation, and it is used to solve contradictory problems [
33,
34,
35]. The core of extenics is transforming contradictory problems into compatible ones, and the key in applying it is determining the weight coefficient of each evaluation index. Contradictory problems abound in the real world. In a chemical enterprise, for example, the workshop wants to achieve higher output while remaining in the safest possible state. However, producing more finished products places a heavier load on the workshop and its personnel, which can easily amplify potential safety hazards. It also means that the warehouse must store more finished products, which increases the possibility of damage [
36,
37,
38,
39,
40]. At the same time, because actual production operations and working conditions vary from case to case, the factors that influence production safety differ between workshops, and so do the parameters selected for evaluating the workshop safety level. In a specific production process, the safety factor parameters and the evaluation index weight coefficients need to be determined by more objective and scientific methods. In this paper, the weight coefficients are determined by an analytical model that is driven entirely by field monitoring data.
The safety evaluation of chemical plant (SECP) model proposed in this paper is an extension mathematical model driven by real-time monitoring data from production operations in the factory and by outdoor meteorological data. It is used to resolve the contradiction between production requirements and safety prevention and to conduct regular evaluations of production safety in the chemical factory. In chemical plant safety evaluation, the integrity, correctness, and consistency of the data are degraded during input and collection by measurement and calculation errors, packet loss, and human factors, so the evaluation is strongly affected by data errors, incomplete information, and subjective judgment. Because hardware error cannot be avoided, data measured in the industrial field are rarely exact. The time series collected by an enterprise's on-site monitoring system commonly show local fluctuations, which are often random. These effects cause the data to deviate from, and thus distort, the real state of the system. In other words, characterizing a real-world production system by data introduces uncertainty, and this uncertainty reduces the accuracy of the final safety evaluation. This paper uses the chemical factory safety extension prerisk model to assess the process risk of chemical production systems. The uncertainty of the data is described by converting the observed values into interval data. Risk assessment of the production process using extension theory relates the degree of risk to each influencing factor (both production factors and environmental factors). Through multi-level index division, the risk arising from the joint action of the factory production equipment and the surrounding environment can be assessed for the purpose of early warning.
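As an illustration of converting observed values into interval data, the following Python sketch widens a single reading into an interval. The symmetric relative-error form, the names `Interval` and `to_interval`, and the 2% error are assumptions made for illustration only; the way the model constructs its intervals may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval [lower, upper] standing in for an uncertain reading."""
    lower: float
    upper: float


def to_interval(reading: float, relative_error: float) -> Interval:
    """Widen a point reading x into [x - |x|e, x + |x|e] for a relative error e.

    The symmetric relative-error model is an assumption of this sketch.
    """
    if relative_error < 0:
        raise ValueError("relative_error must be non-negative")
    delta = abs(reading) * relative_error
    return Interval(reading - delta, reading + delta)


# Hypothetical gas concentration reading with an assumed 2% sensor error.
h2 = to_interval(50.0, 0.02)
print(h2)
```

Representing each observation by its bounds keeps the sensor uncertainty explicit, so later steps can work with ranges rather than with a single value that may be slightly wrong.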
This paper first comprehensively analyzes and evaluates the data quality in a way that is objective, fair, and interpretable; the key point is selecting evaluation index dimensions that suit the application scenario. Second, because sensors collect data with a certain error, the deterministic numerical data are converted into uncertain interval data to make the later evaluation results more accurate, and the index weights are determined by the game theory comprehensive weighting method. Finally, the comprehensive safety evaluation is carried out using the uncertainty elementary dependent function in two nested regions.
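The game theory comprehensive weighting step can be illustrated with a minimal Python sketch. It assumes the commonly used formulation in which one subjective and one objective weight vector are combined linearly, with coefficients chosen to minimize the deviation of the combination from each input vector. The function names and sample weights are hypothetical, and the formula used in this paper may differ in detail.

```python
def dot(u, v):
    """Inner product of two equally long weight vectors."""
    return sum(a * b for a, b in zip(u, v))


def game_theory_weights(w1, w2):
    """Combine two weight vectors (e.g. subjective and objective).

    Solves the 2x2 system  [[w1.w1, w1.w2], [w2.w1, w2.w2]] a = [w1.w1, w2.w2]
    (first-order condition of the deviation-minimization problem), then
    normalizes |a| so that the two coefficients sum to one.
    """
    a11, a12, a22 = dot(w1, w1), dot(w1, w2), dot(w2, w2)
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        # The two vectors are (nearly) parallel, so they already agree.
        return list(w1)
    x1 = (a11 * a22 - a12 * a22) / det  # Cramer's rule
    x2 = (a11 * a22 - a12 * a11) / det
    total = abs(x1) + abs(x2)
    c1, c2 = abs(x1) / total, abs(x2) / total
    return [c1 * p + c2 * q for p, q in zip(w1, w2)]


# Hypothetical weights for three indexes.
subjective = [0.5, 0.3, 0.2]
objective = [0.2, 0.3, 0.5]
combined = game_theory_weights(subjective, objective)
print(combined)
```

Because the result is a convex combination of the two inputs, it still sums to one whenever both input vectors do.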
The rest of the paper is organized as follows:
Section 2 discusses the primitive representation of safety-security data of chemical enterprises.
Section 3 provides a chemical factory safety extension prerisk model and discusses the details of the model.
Section 4 presents the evaluation and analysis of safety and environmental data of each workshop in chemical enterprises.
Section 5 discusses the evaluation results.
Section 6 concludes the paper and discusses possible future works.
2. Primitive Representation of Safety-Security Data of Chemical Enterprises
SECP is a process of systematically collecting information according to production objectives and safety principles and then judging and evaluating the safety level of process monitoring and of the production environment during production operations. SECP includes workshop safety evaluation, hazard source risk assessment, toxicant analysis, employee self-assessment, and accident consequence simulation analysis. Real-time monitoring of gas concentrations in workshops and the collection of indoor and outdoor environmental information are the main sources of safety-security feedback, and they are also important means of checking the safety level and evaluating production safety and the environment. In this paper, the analysis and mining of the safety and environmental data of chemical industry (SEDCI) focus mainly on real-time gas concentration data. The outdoor meteorological index data are used to help determine the daily risk and the location of suspected hazard sources, and the workshop working condition data (noise, light, etc.) are then combined in the analysis so that the value implied by these data is fully mined to further improve workshop safety and work quality. The following describes the preparation stage for the “workshop safety and environment evaluation data” and the “factory outdoor meteorological data”, which are expressed in primitive form. The stage is divided into two steps: data selection and data preprocessing.
2.1. Data Selection
SECP indicators should be selected according to the actual production operation environment. Field investigation showed that the working environment of the factory is complex and that its hazard sources and dangerous situations are varied. Through investigation, screening, and confirmation with the field engineers, the indexes with little influence were removed. The remaining indicators were selected according to the following principles.
(1) Practicality and representativeness: practicality is the first principle and the basic premise of constructing the early warning index system. Many factors can cause accidents in chemical plants, and the problems involved are wide-ranging. Evaluators often choose many indicators in order to describe the research object more comprehensively and accurately, but many of these indicators do not reflect specific problems well; they are difficult to quantify and operate and may reduce the accuracy of the evaluation results. The early warning index system should therefore be representative, so that it is concise and easy to operate. In this paper, the indicators are screened according to the opinions of experts in the field, and a simple, easy-to-operate evaluation index system is established.
(2) Scientificity and systematicness: the indicators should be based on prior knowledge of the field and should objectively reflect the various factors that affect an enterprise's production of chemical products, together with their interrelationships, so as to accurately reveal the safety status of the enterprise. Early warning management involves all aspects of the safety management of chemical enterprises. The accuracy of early warning can be ensured only by comprehensively integrating and analyzing the various risk factors and using a number of quantitative indicators to predict the risk degree of chemical production.
(3) Combination of qualitative and quantitative aspects: quantitative indicators reduce the influence of subjective factors and make the early warning as objective and realistic as possible. Because the influencing factors are complex, however, they are sometimes difficult to describe accurately with numbers. Qualitative indicators are particularly important when the data are insufficient, since they can identify the development trend at a given stage. Both kinds of indicators therefore need to be considered in early warning.
The following parameters are selected as the evaluation basis in this paper, as shown in
Figure 1.
Based on the basic indicators of the workshop, the thermal comfort index and the visible light environment are introduced to evaluate the influence of the working environment on the working state of employees. The thermal comfort index, which evaluates the thermal environment, is the comprehensive response of the human body to the various factors of the thermal environment. This paper uses the predicted mean vote (PMV) as the thermal comfort index. The original range of the PMV index is [−3, +3], and the corresponding thermal sensations form seven levels: cold, cool, slightly cool, comfortable, slightly warm, warm, and hot. In this paper, the PMV index is redefined according to field working conditions, as shown in
Table 1.
Visible light environment data are collected using a five-point Likert scale. During production, visible light intensity data are collected at different positions in the workshop. The levels of the visible light environment assessment index are determined according to the measured environment, as shown in
Table 2.
The purpose of data selection is to determine the operation object of the discovery task, which means extracting relevant data from the original database to form the target data according to the needs of users.
The target data are expressed as the matter element

$$R_1(t) = (N_1(t), c, v(t)) = \begin{bmatrix} N_1(t) & c_1 & v_1(t) \\ & c_2 & v_2(t) \\ & \vdots & \vdots \\ & c_n & v_n(t) \end{bmatrix},$$

where N1(t) is the object, c is the feature name of the object, v is the magnitude of N1(t) on c, and t is a general parameter. For example, for SEDCI, N1(t) = hydrogen production workshop A, c1 = hydrogen concentration, c2 = oxygen concentration, c3 = hydrogen sulfide concentration, ..., and vn(t) is the magnitude corresponding to cn. Four hydrogen workshops in the factory are taken as examples, and the data from March 2019 to August 2019 for the selected indexes form the SEDCI. SEDCI consists of real monitoring data from gas chemical production workshops. The workshops monitor and control the production behavior of their systems by integrating digital technology with equipment. The collected data contain the following information: (1) the running data of the production system between March and August (excluding equipment maintenance and time outside the production schedule); (2) data from 2000 collection points in the four hydrogen workshops, where one point represents one collected attribute (for example, the temperature of a piece of equipment). Because many of the collected attributes are irrelevant to safety evaluation, they were screened under the guidance of experts; (3) production-related data, read from the real-time database once per second; and (4) environment-related data, collected every five minutes.
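The matter element notation above maps naturally onto a small data structure. The following Python sketch is one possible representation; the class name `MatterElement`, its fields, and the numeric magnitudes are hypothetical placeholders rather than values from the SEDCI.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class MatterElement:
    """R = (N, c, v): an object N, its feature names c, and magnitudes v."""
    name: str                                   # N, the object
    features: Dict[str, float] = field(default_factory=dict)  # c_i -> v_i

    def magnitude(self, feature: str) -> float:
        """Return v_i, the magnitude of the object on feature c_i."""
        return self.features[feature]


# Snapshot of one workshop at a single time t (placeholder magnitudes).
workshop_a = MatterElement(
    name="hydrogen production workshop A",
    features={
        "hydrogen concentration": 12.0,
        "oxygen concentration": 20.9,
        "hydrogen sulfide concentration": 0.5,
    },
)
print(workshop_a.magnitude("hydrogen concentration"))
```

A time-dependent matter element can then be stored as a sequence of such snapshots, one per sampling instant.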
2.2. Data Preprocessing
Data are the necessary foundation for the informatization of enterprises. However, with the rapid expansion of enterprise application data, the emergence of new applications, and the integration of applications, data quality problems have become increasingly prominent. They appear mainly as incorrect, incomplete, and inconsistent data. Poor-quality data have become an important factor that impairs decision-making, safety prevention, and hazard source investigation in chemical enterprises. Data quality management is therefore an essential link in the informatization of chemical enterprises.
Data analysis and mining rely on real and accurate data, and data quality determines the success or failure of data applications. The low data quality of the chemical factories studied in this paper has many causes, including careless input of original data, low equipment accuracy, interference from the external environment, data packet loss, and misalignment during data integration. Data collection in workshops is frequent and varied, which makes data quality problems more prominent. Because of the large amount of data, the complex correlations among data, and the diversity of data structures, the consistency and integrity of the data are difficult to maintain. This is a potential weakness when data mining is used to obtain knowledge for safety decision support. Data preprocessing processes the extracted data R1(t) to meet the requirements of data mining. Its main tasks include data filling, data deduplication, outlier deletion, derivation of missing data, and data type conversion. This step mainly examines the quality of the data and finds datasets that meet the requirements and can be mined effectively, in preparation for further analysis. The dataset can be represented by the multidimensional matter element R2:
Data quality has become an important factor affecting the application of data mining: incorrect, incomplete, redundant, or sparse data reduce the credibility of the final mining conclusions. In SECP, for example, abnormal data are often generated by transcription errors and other factors. These abnormal data must be eliminated and cleaned before data mining is performed. Otherwise, the mining either cannot be carried out or yields conclusions of very low accuracy and little application value. Data cleaning, however, often takes considerable time. As the data grow, new records with possible quality problems are imported into the database every day, so data cleaning must be carried out continuously to ensure the quality of the data used for mining.
Since the SEDCI consists mainly of sensor data, it contains both discrete numerical data and sinusoidal electrical signals. For discrete data, the quartile method is adopted, and each day's data are treated as one individual. The outlier range is determined from historical data, as shown in
Figure 2. Outliers and daily individuals with more than 50% of their data missing are deleted, and the remaining missing values are filled by cubic spline interpolation. For sinusoidal signals, the collected signal data are transformed from the time domain to the frequency domain, and erroneous data, such as peak load shifting and abnormal cycles, are removed. The first sample of each daily individual is taken as the first peak position, and the last sample is taken as the last trough position. The daily individuals are interpolated to the same length, and the data are then inversely transformed from the frequency domain back to the time domain. Thus, an extension set is established on the raw dataset.
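The cleaning of discrete data described above can be sketched in Python as follows. The sketch assumes the usual 1.5 × IQR fences for the quartile method, and it substitutes linear interpolation for the cubic spline interpolation named in the text so that it runs with the standard library only; the function names and sample values are hypothetical.

```python
import statistics


def iqr_fences(history, k=1.5):
    """Outlier fences [Q1 - k*IQR, Q3 + k*IQR] estimated from historical data."""
    q1, _, q3 = statistics.quantiles(history, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def fill_linear(values):
    """Fill None gaps linearly (a stand-in for cubic spline interpolation)."""
    known = [i for i, x in enumerate(values) if x is not None]
    filled = list(values)
    for i, x in enumerate(values):
        if x is not None:
            continue
        left = max((j for j in known if j < i), default=None)
        right = min((j for j in known if j > i), default=None)
        if left is None:            # gap at the start: copy nearest value
            filled[i] = values[right]
        elif right is None:         # gap at the end: copy nearest value
            filled[i] = values[left]
        else:
            t = (i - left) / (right - left)
            filled[i] = values[left] + t * (values[right] - values[left])
    return filled


def clean_day(day, fences):
    """Clean one daily individual; return None if the whole day is dropped."""
    missing = sum(x is None for x in day)
    if missing > len(day) / 2:      # more than 50% missing: delete the day
        return None
    low, high = fences
    masked = [x if x is not None and low <= x <= high else None for x in day]
    if all(x is None for x in masked):
        return None
    return fill_linear(masked)


history = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
fences = iqr_fences(history)
cleaned = clean_day([12.0, None, 40.0, 14.0, 13.0, 15.0], fences)
print(cleaned)
```

If SciPy is available, `fill_linear` could be replaced by a fit with `scipy.interpolate.CubicSpline` over the known samples, which is closer to the interpolation named in the text.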
The matter element extension model in extenics is an evaluation method that addresses the fuzziness, diversity, and incompatibility of evaluation objects, but it has imperfections in theory and application. For example, when the index data exceed the controlled field, the dependent function cannot be calculated, and the object therefore cannot be evaluated. In addition, because the indexes have different units, an excessively large difference in magnitude between indexes easily affects the accuracy of the analysis results. To make the indexes comparable, the data should be normalized. Some indexes affect the workshop safety environment positively and others negatively, so the two kinds require different normalization treatments. For an indicator with a positive effect, the expression is:
Additionally, for the indicator of negative effect:
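A minimal Python sketch of the two cases follows. It assumes min-max scaling to [0, 1], a common choice for positive and negative indicators; the expressions used in this paper may differ, and the function names and sample bounds are hypothetical.

```python
def normalize_positive(x, x_min, x_max):
    """Min-max scaling for a positive-effect indicator (larger is better)."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x - x_min) / (x_max - x_min)


def normalize_negative(x, x_min, x_max):
    """Min-max scaling for a negative-effect indicator (smaller is better)."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x_max - x) / (x_max - x_min)


# Hypothetical index value 2.0 observed on a scale from 0 to 10.
print(normalize_positive(2.0, 0.0, 10.0))
print(normalize_negative(2.0, 0.0, 10.0))
```

Under this assumption the two normalized values of the same raw reading always sum to one, so a high reading of a negative-effect indicator maps to a low normalized score.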
After preprocessing, the data of each index are evaluated by the point-to-interval extension method. The correct, complete, and consistent data individuals are selected from the raw dataset so that the data quality meets the requirements for effective mining and for determining the model weights, which balances the usable quality of the data against the preservation of its original information. The matter element model of the data quality of the workshop safety evaluation indexes is therefore denoted as:
The judgment standards for correctness, integrity, and consistency are as follows. The score of each index is the ratio of the number of records that satisfy the standard to the total number of records queried:
- (1) column is not null (weight: 9, expected value: 90): integrity;
- (2) column reaches the specified length (weight: 10, expected value: 90): effectiveness;
- (3) column value is within the standard range (weight: 10, expected value: 98): effectiveness.
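The scoring rule above can be sketched in Python as follows. The weights and expected values are taken from the three standards; the record layout, the fixed timestamp length of 19 characters, the range 0 to 100, and the weighted-average aggregate are assumptions made for illustration.

```python
# Hypothetical records: a timestamp column and a numeric value column.
records = [
    {"timestamp": "2019-03-01 08:00:00", "value": 12.1},
    {"timestamp": "2019-03-01 08:00:01", "value": None},
    {"timestamp": "2019-03-01 08:00", "value": 250.0},
    {"timestamp": "2019-03-01 08:00:03", "value": 11.8},
]

# (name, weight, expected score in percent, predicate on one record)
RULES = [
    ("not null", 9, 90, lambda r: r["value"] is not None),
    ("specified length", 10, 90, lambda r: len(r["timestamp"]) == 19),
    ("within standard range", 10, 98,
     lambda r: r["value"] is not None and 0.0 <= r["value"] <= 100.0),
]


def quality_report(rows):
    """Score each rule as the percentage of rows that satisfy it."""
    report = {}
    for name, weight, expected, ok in RULES:
        score = 100.0 * sum(1 for r in rows if ok(r)) / len(rows)
        report[name] = {
            "score": score,
            "weight": weight,
            "meets_expectation": score >= expected,
        }
    return report


def weighted_quality(report):
    """Weighted average of the rule scores (an assumed aggregate)."""
    total_weight = sum(item["weight"] for item in report.values())
    return sum(item["weight"] * item["score"] for item in report.values()) / total_weight


report = quality_report(records)
print(report)
print(weighted_quality(report))
```

A rule whose score falls below its expected value flags the column for cleaning before the dataset is passed on to mining.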
According to the definition of the extension set and the type of extension transform, the domain extension transform, association rule extension transform, and element extension transform are carried out to select the dataset for mining.
4. Evaluation and Analysis of Safety and Environmental Data of Each Workshop in Chemical Enterprises
According to the chemical factory safety extension prerisk model, the real scene and data of the specialty gas production workshop in a chemical plant are taken as an example for calculation and analysis. The chemical enterprise studied in this paper is located in northern China, in a city with a temperate continental monsoon climate: spring is changeable, windy, and dry; summer is hot with little rainfall; autumn is mild; and winter is cold and dry. The annual sunshine duration is long, and the sunshine intensity is high. The factory mainly produces hydrogen, nitrogen trifluoride, tungsten hexafluoride, trifluoromethanesulfonic acid, high-purity gas, and mixed gas. Its annual output can reach 7300 tons of specialty gas and 80,000 tons of liquid nitrogen. The production flow chart of a workshop in the chemical plant is shown in
Figure 3.
The production workshop has a large area and a high lifting frame, so gas monitoring sensors need to be installed in layers and sections (the data collection scheme is shown in
Figure 4). The working environment is characterized by high noise, high temperature, and poor lighting. The warehouse in the factory stores flammable, explosive, and toxic raw materials, semifinished products, and finished products, so an accident would seriously endanger the lives and safety of personnel. Gas leakage accidents are the most common in the factory, while explosion accidents are the most harmful. An explosion would have a significant impact on the whole city, its residents, and the surrounding areas, and leaked toxic gases would harm people in the factory area and nearby. At the same time, the flammable, explosive, and toxic chemical raw materials and the complicated fire conditions in the factory area would greatly hinder rescue work by firefighters.
Using Formula (7), the weights of the indexes of the safety and environmental data of chemical enterprises are calculated, as shown in
Table 3:
The classical field and controlled field of safety and environmental data evaluation of each workshop in chemical enterprises are shown in the following
Table 4 and
Table 5:
According to the actual situation of the chemical factory studied in this paper, the index value of the matter element to be evaluated is determined. Taking hydrogen production in the first workshop as an example, the index value is:
Taking hydrogen concentration in indoor safety and environmental indexes as an example, the correlation degree is calculated as follows:
Nonrisk:
First, the interval distance values of hydrogen concentration calculated by Formulas (11) and (12) are:
Second, by using Formula (13), the position value is:
Finally, Formula (14) is used to obtain the correlation degree when there is nonrisk:
Similarly, the correlation degree of light risk, medium risk, and heavy risk can be calculated, and the results are −0.25, −0.52, and −0.68, respectively.
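The three steps of this calculation (distance, position value, correlation degree) can be illustrated with a simplified Python sketch. It uses the classical point-value elementary dependent function of matter element extension evaluation, not the interval-based Formulas (11)–(14) of this paper, and the classical and controlled fields below are hypothetical numbers chosen for illustration.

```python
def distance(x, interval):
    """Extension distance rho(x, [a, b]) = |x - (a + b)/2| - (b - a)/2."""
    a, b = interval
    return abs(x - (a + b) / 2) - (b - a) / 2


def correlation_degree(x, classical, controlled):
    """Classical elementary dependent function for a point value x.

    classical:  interval of one risk level (classical field)
    controlled: interval covering all levels (controlled field)
    """
    rho0 = distance(x, classical)
    if rho0 <= 0:
        # x lies inside the classical field of this level.
        return -rho0 / (classical[1] - classical[0])
    position = distance(x, controlled) - rho0  # position value
    if position == 0:
        raise ValueError("position value is zero; dependent function undefined")
    return rho0 / position


# Hypothetical fields for one gas concentration index.
controlled_field = (0.0, 40.0)
nonrisk_field = (0.0, 10.0)
light_risk_field = (10.0, 20.0)

k_nonrisk = correlation_degree(6.0, nonrisk_field, controlled_field)
k_light = correlation_degree(6.0, light_risk_field, controlled_field)
print(k_nonrisk, k_light)
```

A positive value indicates that the reading belongs to the level, and a negative value indicates that it does not, with the magnitude showing how far the reading is from the level.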
The correlation degrees of each second-level index in the indoor safety and environmental indexes and the outdoor meteorological indexes are obtained, as shown in
Table 6 and
Table 7.
Therefore, according to Formula (15), the comprehensive correlation degree between indoor safety and environmental indexes and outdoor meteorological indexes is calculated as follows:
By the above calculation, the weight and correlation degree of the two first-level indexes can be obtained. Therefore, the comprehensive correlation degree of the first workshop of safety and environmental data of chemical enterprises can be calculated according to Formula (15), as shown in the following:
Finally, according to Formula (16), the prerisk level of the first workshop of safety and environmental data of chemical enterprises is determined to be nonrisk. Similarly, the comprehensive correlation degree of the second workshop, third workshop, and fourth workshop can be obtained as follows:
In the end, the prerisk levels of the four workshops are nonrisk, light-risk, light-risk, and nonrisk.
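The last two steps, aggregating the correlation degrees and choosing the prerisk level, can be sketched in Python. The sketch assumes the usual convention of matter element extension evaluation, in which the comprehensive correlation degree is the weighted sum of the index correlation degrees and the level with the largest comprehensive degree is selected. The weights and most of the correlation degrees below are hypothetical; only −0.25, −0.52, and −0.68 come from the worked example above.

```python
LEVELS = ("nonrisk", "light-risk", "medium-risk", "heavy-risk")


def comprehensive_degree(weights, degrees):
    """Weighted sum over indexes; degrees[i][j] is index i versus level j."""
    return [
        sum(w * row[j] for w, row in zip(weights, degrees))
        for j in range(len(LEVELS))
    ]


def risk_level(comprehensive):
    """Pick the level whose comprehensive correlation degree is largest."""
    best = max(range(len(comprehensive)), key=comprehensive.__getitem__)
    return LEVELS[best]


# Two indexes with assumed weights; the first entry of each row is assumed.
weights = [0.6, 0.4]
degrees = [
    [0.40, -0.25, -0.52, -0.68],
    [0.10, 0.05, -0.30, -0.60],
]

comprehensive = comprehensive_degree(weights, degrees)
print(comprehensive, risk_level(comprehensive))
```

The same two functions can be applied once per first-level index group and once more across the groups, which mirrors the two-stage aggregation used for the four workshops.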
5. Discussion
Risk analysis is an important part of safe production and safety management in chemical enterprises and is used to raise their level of safety management. A safety analysis method should therefore be scientific, reasonable, and clear, and it should reflect the safety status of chemical enterprises objectively and truthfully; achieving this remains an open problem. In this paper, the safety production factors of chemical enterprises are investigated and analyzed, safety indicators at different levels are selected as evaluation indicators, and extension engineering theory is used to establish a risk early warning model for chemical enterprises. The selection of safety evaluation indexes is the basis of any safety evaluation method for chemical enterprises. Safety risk analysis of chemical enterprises involves personnel, equipment and facilities, the operating environment, safety management, and other aspects of the production process, and safety researchers worldwide have made great efforts in this regard. However, chemical production is a complex system in terms of both production and management, involving strict processing conditions, complete sets of equipment and facilities, and numerous dangerous and harmful factors, so a single index can hardly reflect the actual safety status of an enterprise. It is necessary to select, from different levels, the influencing variables that play a dominant role in the safe production of chemical enterprises. As far as possible, each index should follow the principles of practicality, representativeness, scientificity, systematicness, and the combination of qualitative and quantitative aspects.
Once the degree of damage for each trigger has been assessed, each estimate can be combined with the corresponding probability of occurrence, and this information can be aggregated into a single risk index. We use the safety evaluation model to comprehensively evaluate the safety status of a typical chemical enterprise in Hebei, China. The chemical production process involves a variety of dangerous chemical substances, and the intermediate and finished products are usually toxic, flammable, and explosive. The plant area of a chemical plant is therefore usually identified as a major hazard source, and a chemical accident threatens not only the enterprise itself but also the local city. According to the risk early warning model proposed in this paper, the prerisk level of the final safety evaluation result is light-risk. Combined with the actual situation of the enterprise, this paper puts forward four feasible suggestions for the safe production and risk management of the chemical plant: (1) further strengthen personnel operation standards; (2) treat indicators with low scores in the model as potential risks, and conduct risk investigations and focused monitoring to improve inappropriate production modes; (3) strengthen gas concentration monitoring at pipelines; and (4) strengthen the management of electrolytic equipment. The risk early warning model in this paper thus provides a theoretical basis for the safety production analysis of such chemical enterprises.