B-TBM: A Novel Deep Learning Model with Enhanced Loss Function for HAZOP Risk Classification Using Natural Language Statistical Laws
Abstract
1. Introduction
- 1. We assess the complexity of risks represented by HAZOP language, providing a case study for other industrial practices;
- 2. We introduce a novel risk classification model that leverages BERT and incorporates a newly proposed loss function;
- 3. Extensive experimentation substantiates the effectiveness of the model, making it a valuable tool for risk analysis by expert groups, engineers, and other enterprises.
2. Related Work
2.1. Risk Classification
2.2. Statistical Laws of Natural Language
3. Method
3.1. Complexity Measurement
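The full text measures the complexity of HAZOP language via statistical laws of natural language (Section 2.2), in the tradition of Zipf's law (reference [13]). As a hedged illustration, not the authors' exact procedure, a Zipf-style rank-frequency exponent can be estimated from token counts by a log-log least-squares fit:

```python
from collections import Counter
import math

def zipf_fit(tokens):
    """Fit log(frequency) = log(C) - s * log(rank) by least squares.

    Returns the estimated exponent s; values near 1 are typical of
    natural-language rank-frequency distributions.
    """
    freqs = sorted(Counter(tokens).values(), reverse=True)
    xs = [math.log(r) for r in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    return -cov / var  # the fitted slope is -s, so return s

# Toy "corpus" whose frequencies (8, 4, 2, 1) decay steeply;
# the fit yields s between 1 and 2.
tokens = ["valve"] * 8 + ["pump"] * 4 + ["tank"] * 2 + ["flare"]
s = zipf_fit(tokens)
```

A corpus whose exponent deviates strongly from ordinary text would signal the kind of domain-specific complexity the paper quantifies for HAZOP reports.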
3.2. B-TBM
3.2.1. BERT Layer: Capturing Multi-Level Semantic and Contextual Information
3.2.2. BiLSTM Layer: Capturing Long-Term Dependency Information
3.2.3. TextCNN Layer: Capturing Local Contextual Information
3.2.4. Feature Splicing and Fully Connected Layer
3.2.5. LFCF Loss Function
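The LFCF loss function itself is defined only in the full text. As an illustrative stand-in in the same spirit, countering label imbalance by re-weighting cross-entropy with inverse class frequency, the following sketch is an assumption, not the authors' formula:

```python
import math

def weighted_cross_entropy(probs, label, class_counts):
    """Cross-entropy for one sample, weighted by inverse class frequency.

    probs        -- predicted probability for each class (sums to 1)
    label        -- index of the true class
    class_counts -- number of training samples per class
    """
    total = sum(class_counts)
    # Rare classes receive larger weights, so the model is penalised
    # more for misclassifying them -- a common remedy for imbalance.
    weight = total / (len(class_counts) * class_counts[label])
    return -weight * math.log(probs[label])

# Class 2 is rare (10 of 160 samples), so with the same predicted
# probability its loss is 10x that of the common class 0.
counts = [100, 50, 10]
loss_common = weighted_cross_entropy([0.7, 0.2, 0.1], 0, counts)
loss_rare = weighted_cross_entropy([0.1, 0.2, 0.7], 2, counts)
```

Any frequency-sensitive re-weighting of this kind changes only the loss, not the BERT/BiLSTM/TextCNN feature extraction, so it can be swapped in without altering the rest of the architecture.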
4. Experiment
4.1. Data Description
4.2. Experimental Setup
4.3. Metrics
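Sections 4.3 and 5.2 evaluate the models by accuracy, confusion matrices, recall, and F1. As a minimal sketch (macro averaging over classes is an assumption here), all three scalar metrics follow directly from a multi-class confusion matrix:

```python
def metrics_from_confusion(cm):
    """Accuracy, macro recall, and macro F1 from a confusion matrix.

    cm[i][j] counts samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    recalls, f1s = [], []
    for i in range(n):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                      # missed class-i samples
        fp = sum(cm[r][i] for r in range(n)) - tp  # wrongly assigned to i
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        recalls.append(recall)
        f1s.append(f1)
    return accuracy, sum(recalls) / n, sum(f1s) / n

# Hypothetical 3-class matrix (e.g., three severity labels).
cm = [[8, 1, 1],
      [2, 6, 2],
      [0, 1, 9]]
acc, macro_recall, macro_f1 = metrics_from_confusion(cm)
```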
5. Results
5.1. HAZOP Complexity
5.2. Performance of B-TBM
5.2.1. Accuracy
5.2.2. Confusion Matrix
5.2.3. ROC Curve
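Section 5.2.3 compares the models by ROC curves and the AUC values tabulated below. As a hedged sketch (not the paper's evaluation code, and assuming one-vs-rest binarisation of each label class), AUC equals the probability that a randomly chosen positive sample is scored above a randomly chosen negative one:

```python
def auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) formulation:
    the fraction of positive/negative pairs where the positive
    sample receives the higher score (ties count as half).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Toy scores: two positives ranked highest, one ranked mid-range.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why the 0.95-0.96 values reported for B-TBM indicate strong class separability.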
5.2.4. Loss Functions
5.2.5. Recall and F1
6. Discussion
6.1. Practical Applications
- 1. It can assist expert teams performing safety analysis during raw-material processing, helping them identify risks in decision making. Complex interconnections and cause-and-effect relationships make HAZOP challenging, especially for new processes that demand significant labor and time. Our classification system mitigates this challenge as a valuable aid: during calibration of analysis results, experts can strategically cross-validate risks, noting any inconsistencies with the inferences provided by the classification system. This approach reduces both labor costs and the potential for human error.
- 2. It can assist engineers during operations already in production, where additional risks may arise that have not yet been analyzed by an expert team. Supported by our classification system, engineers can carry out qualitative analyses in advance and quickly adopt appropriate solutions and follow-up management measures.
- 3. It can guide related businesses in initiating HAZOP for their own processes, especially small-scale businesses and independent processes whose security analyses may lack comprehensiveness. Our classification system incorporates HAZOP knowledge from large-scale enterprises, enabling it to provide effective guidance that enhances the reliability of these enterprises' processes.
6.2. Outlook
7. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A
No. | Description | Severity | Possibility |
---|---|---|---|
Risk#1 | The liquid level in liquid separation tank D-5612118 of the gas compressor is too high, causing the compressor to carry liquid and damaging equipment; in severe cases this can cause process hazards and excessive liquid entrainment in the upstream gas phase | 2 | 2 |
Risk#2 | The liquid level of underground solvent tank V-9306 in the public engineering section may be low; a false indication of the tank's liquid level may, in severe cases, cause pump evacuation | 3 | 4 |
References
- Suzuki, T.; Izato, Y.; Miyake, A. Identification of accident scenarios caused by internal factors using HAZOP to assess an organic hydride hydrogen refueling station involving methylcyclohexane. J. Loss Prev. Process Ind. 2021, 71, 104479.
- Zhu, L.; Ma, H.; Huang, Y.; Liu, X.; Xu, X.; Shi, Z. Analyzing construction workers’ unsafe behaviors in hoisting operations of prefabricated buildings using HAZOP. Int. J. Environ. Res. Public Health 2022, 19, 15275.
- Ahn, J.; Chang, D. Fuzzy-based HAZOP study for process industry. J. Hazard. Mater. 2016, 317, 303–311.
- Meng, Y.; Song, X.; Zhao, D.; Liu, Q. Alarm management optimization in chemical installations based on adapted HAZOP reports. J. Loss Prev. Process Ind. 2021, 72, 104578.
- Dunjó, J.; Fthenakis, V.; Vílchez, J.A.; Arnaldos, J. Hazard and operability (HAZOP) analysis. A literature review. J. Hazard. Mater. 2010, 173, 19–32.
- Yousofnejad, Y.; Afsari, F.; Es’haghi, M. Dynamic risk assessment of hospital oxygen supply system by HAZOP and intuitionistic fuzzy. PLoS ONE 2023, 18, e0280918.
- Cheraghi, M.; Eslami Baladeh, A.; Khakzad, N. Optimal selection of safety recommendations: A hybrid fuzzy multi-criteria decision-making approach to HAZOP. J. Loss Prev. Process Ind. 2022, 74, 104654.
- Wu, J.; Song, M.; Zhang, X.; Lind, M. Safeguards identification in computer aided HAZOP study by means of multilevel flow modelling. Proc. Inst. Mech. Eng. Part O J. Risk Reliab. 2023, 237, 922–946.
- Zhang, H.; Zhang, B.; Gao, D. A new approach of integrating industry prior knowledge for HAZOP interaction. J. Loss Prev. Process Ind. 2023, 82, 105005.
- Xu, K.; Hu, J.; Zhang, L.; Chen, Y.; Xiao, R.; Shi, J. A risk factor tracing method for LNG receiving terminals based on GAT and a bidirectional LSTM network. Process Saf. Environ. Prot. 2022, 170, 694–708.
- Ricketts, J.; Pelham, J.; Barry, D.; Guo, W. An NLP framework for extracting causes, consequences, and hazards from occurrence reports to validate a HAZOP study. In Proceedings of the 2022 IEEE/AIAA 41st Digital Avionics Systems Conference (DASC), Portsmouth, VA, USA, 18–22 September 2022; pp. 1–8.
- Jia, Y.; Lawton, T.; McDermid, J.; Rojas, E.; Habli, I. A framework for assurance of medication safety using machine learning. arXiv 2021, arXiv:2101.05620.
- Wang, Z.; Ren, M.; Gao, D.; Li, Z. A Zipf’s law-based text generation approach for addressing imbalance in entity extraction. J. Inf. 2023, 17, 101453.
- Peng, L.; Gao, D.; Bai, Y. A study on standardization of security evaluation information for chemical processes based on deep learning. Processes 2021, 9, 832.
- Zhao, Y.; Zhang, B.; Gao, D. Construction of petrochemical knowledge graph based on deep learning. J. Loss Prev. Process Ind. 2022, 76, 104736.
- Zhang, M.; Chen, W.; Zhang, Y.; Liu, F.; Yu, D.; Zhang, C.; Gao, L. Fault diagnosis of oil-immersed power transformer based on difference-mutation brain storm optimized CatBoost model. IEEE Access 2021, 9, 168767–168782.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Human Language Technologies, Volume 1 (Long and Short Papers), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019; Association for Computational Linguistics: Stroudsburg, PA, USA, 2019.
- Kim, Y. Convolutional neural networks for sentence classification. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 25–29 October 2014; Association for Computational Linguistics: Doha, Qatar, 2014; pp. 1746–1751.
- Graves, A.; Schmidhuber, J. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Netw. 2005, 18, 602–610.
- Joachims, T. Text categorization with support vector machines: Learning with many relevant features. In Proceedings of the 10th European Conference on Machine Learning (ECML ’98), Chemnitz, Germany, 21–23 April 1998; Springer: Berlin/Heidelberg, Germany, 1998; pp. 137–142.
- Fix, E.; Hodges, J.L. Discriminatory Analysis: Nonparametric Discrimination: Consistency Properties. Int. Stat. Rev. 1989, 57, 238–247.
- Rumelhart, D.E.; Hinton, G.E.; Williams, R.J. Learning representations by back-propagating errors. Nature 1986, 323, 533–536.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Liu, T.Y. LightGBM: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3146–3154.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
- Prokhorenkova, L.; Gusev, G.; Vorobev, A.; Dorogush, A.V.; Gulin, A. CatBoost: Unbiased boosting with categorical features. Adv. Neural Inf. Process. Syst. 2018, 31, 6638–6648.
- Jawahar, G.; Sagot, B.; Seddah, D. What does BERT learn about the structure of language? In Proceedings of the ACL 2019—57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019.
- Samela, C.; Carisi, F.; Domeneghetti, A.; Petruccelli, N.; Castellarin, A.; Iacobini, F.; Brath, A. A methodological framework for flood hazard assessment for land transport infrastructures. Int. J. Disaster Risk Reduct. 2023, 85, 103491.
- Akay, H. Flood hazards susceptibility mapping using statistical, fuzzy logic, and MCDM methods. Soft Comput. 2021, 25, 9325–9346.
- Li, Y.; Wang, H.; Bai, K.; Chen, S. Dynamic intelligent risk assessment of hazardous chemical warehouse fire based on electrostatic discharge method and improved support vector machine. Process Saf. Environ. Prot. 2021, 145, 425–434.
- Tian, D.; Li, M.; Han, S.; Shen, Y. A novel and intelligent safety-hazard classification method with syntactic and semantic features for large-scale construction projects. J. Constr. Eng. Manag. 2022, 148, 04022109.
- Wang, F.; Gu, W.; Bai, Y.; Bian, J. A method for assisting the accident consequence prediction and cause investigation in petrochemical industries based on natural language processing technology. J. Loss Prev. Process Ind. 2023, 83, 105028.
- Feng, X.; Dai, Y.; Ji, X.; Zhou, L.; Dang, Y. Application of natural language processing in HAZOP reports. Process Saf. Environ. Prot. 2021, 155, 41–48.
- Zhang, F.; Wang, B.; Gao, D.; Yan, C.; Wang, Z. When grey model meets deep learning: A new hazard classification model. Inf. Sci. 2024, 670, 120653.
- Wang, Z.; Wang, B.; Ren, M.; Gao, D. A new hazard event classification model via deep learning and multifractal. Comput. Ind. 2023, 147, 103875.
- Ekramipooya, A.; Boroushaki, M.; Rashtchian, D. Predicting possible recommendations related to causes and consequences in the HAZOP study worksheet using natural language processing and machine learning: BERT, clustering, and classification. J. Loss Prev. Process Ind. 2024, 89, 105310.
- Rezashoar, S.; Kashi, E.; Saeidi, S. A hybrid algorithm based on machine learning (LightGBM-Optuna) for road accident severity classification (case study: United States from 2016 to 2020). Innov. Infrastruct. Solut. 2024, 9, 319.
- Xie, J.; Li, Z.; Zhou, Z.; Liu, S. A novel bearing fault classification method based on XGBoost: The fusion of deep learning-based features and empirical features. IEEE Trans. Instrum. Meas. 2020, 70, 1–9.
- Walczak, M.; Poniszewska-Marańda, A.; Stepień, K. Classification of events in selected industrial processes using weighted key words and K-nearest neighbors algorithm. Appl. Sci. 2023, 13, 10334.
- Orrù, P.F.; Zoccheddu, A.; Sassu, L.; Mattia, C.; Cozza, R.; Arena, S. Machine learning approach using MLP and SVM algorithms for the fault prediction of a centrifugal pump in the oil and gas industry. Sustainability 2020, 12, 4776.
- Wang, F.; Gu, W. Intelligent HAZOP analysis method based on data mining. J. Loss Prev. Process Ind. 2022, 80, 104911.
- Jang, B.; Kim, M.; Harerimana, G.; Kang, S.-u.; Kim, J.W. Bi-LSTM model to increase accuracy in text classification: Combining Word2vec CNN and attention mechanism. Appl. Sci. 2020, 10, 5841.
Model Name | Structural Features | Advantages | Disadvantages | Applicable Scenarios |
---|---|---|---|---|
BERT [17] (Bidirectional Encoder Representations from Transformers) | Bidirectional Transformer, taking context into account | Excellent ability to capture contextual semantics, widely used in a variety of NLP tasks | High computational and memory cost; slow to train and deploy | Various text classification tasks, especially long and complex texts |
TextCNN [18] (Text Convolutional Neural Networks) | Uses convolutional layers to capture n-gram features | Captures local features effectively and is computationally efficient | Limited ability to model long-range dependencies | Short text classification tasks, e.g., news classification, comment classification |
BiLSTM [19] (Bidirectional Long Short-Term Memory) | Processes sequence data through long short-term memory networks in both directions | Suitable for processing sequence-dependent text | Slow to train; weaker at capturing local features | Long text classification tasks that require contextual processing |
SVM [20] (Support Vector Machine) | High-dimensional linear classification | Suitable for small-scale datasets, high efficiency and high accuracy | Requires manual feature extraction and cannot handle contextual information | Small-scale text classification tasks |
KNN [21] (K-Nearest Neighbors) | Instance-based classification | Easy to understand and implement | Computationally inefficient, has difficulty handling large-scale datasets | Small-scale text classification tasks |
MLP [22] (Multilayer Perceptron) | Multilayer feed-forward neural network for various tasks | Suitable for nonlinear classification problems | Requires a lot of data and time for training | Problems with large amounts of data that linear models cannot solve |
LightGBM [23] (Light Gradient Boosting Machine) | A framework based on gradient boosting decision trees for large-scale datasets | Fast training, low memory usage, support for class imbalance | Sensitive to hyperparameter tuning | Scenarios requiring fast training and high performance |
XGBoost [24] (eXtreme Gradient Boosting) | Optimization models based on gradient boosted trees | Robust performance with automatic missing-value handling | Relatively long training time and high model complexity | Large-scale datasets and scenarios with many features |
CatBoost [25] (Category Boosting) | Decision tree modeling based on gradient boosting | Handles categorical features well, with fast training speeds | Higher memory requirements and more parameters | Scenarios with many categorical features and large amounts of data |
Description | Severity | Possibility |
---|---|---|
High temperatures in parts of the diesel output unit and excessive flow of refined diesel fuel, which in severe cases affects the operation of the diesel tank area. | 1 | 2 |
In the pipeline section from the cold high-pressure separator and hydraulic turbine to the outlet at the bottom of the low-pressure separator, the boundary level of cold high-pressure separator V-8107 is too high; large amounts of raw oil carry water, system pressure fluctuates, catalyst strength decreases, and product quality is affected. | 2 | 4 |
The liquid level of water injection tank V-8110 is too high; in severe cases, water injection tank V-8110 overpressures and explodes. | 3 | 2 |
In the pipeline section where the reaction feed passes through the reaction effluent/reaction feed heat exchanger and the reaction feed heating furnace to the reactor inlet, the fuel gas flow rate is too high, the fuel gas pipeline network pressure is too high, the furnace tube temperature rises, the furnace outlet temperature increases, and in severe cases the reactor temperature runs away. | 4 | 3 |
In the pipeline section where the fractionation tower bottom oil passes through the fractionation tower bottom reboiler and returns to the tower, the reboiler outlet temperature is high and the bottom oil flow rate is too low; in severe cases, the furnace tube burns out. | 5 | 3 |
Description | Severity | Possibility |
---|---|---|
Hydrogen-mixed oil is heated by the reaction effluent/mixed feed heat exchanger and the reaction feed heater before entering the pipeline section of the hydrofining reactor. Abnormal inspection and maintenance lead to burnt and deflected flow in the furnace tube; in serious cases, the furnace tube burns out and the plant shuts down. | 3 | 1 |
In the reaction water-injection section, the liquid level of water injection tank V-8110 is too high; in severe cases, water injection tank V-8110 overpressures and explodes. | 3 | 2 |
In the pipeline section where circulating hydrogen passes through the circulating hydrogen inlet separator tank, the circulating hydrogen desulfurization tower, and the circulating hydrogen compressor inlet separator tank to the circulating hydrogen compressor outlet, the liquid level of underground solvent tank V-8305 is too high and the accompanying pipeline leaks; in severe cases, the tank overflows to the flare system. | 1 | 3 |
In the pipeline section where hydrogen passes through the new hydrogen dechlorination tank and the new hydrogen compressor to the circulating hydrogen compressor outlet, during repair of new hydrogen compressors C-8102A∼C, the gas valve plate fails and the machine is cut out for repair. | 1 | 4 |
Low-pressure-separator oil passes through the reaction effluent/low-pressure-separator oil heat exchanger to the main stripper tower and the top reflux portion of the tower; the pressure of main stripper tower T-8201 is too low, and the oil components become heavy and flood the tower. | 2 | 5 |
Index | Method | PDSP Severity Label Accuracy (%) | PDSP Possibility Label Accuracy (%) | PHHP Severity Label Accuracy (%) | PHHP Possibility Label Accuracy (%) |
---|---|---|---|---|---|
1 | SVM | 74.78 | 74.11 | 78.87 | 57.39 |
2 | KNN | 70.31 | 68.97 | 73.94 | 61.27 |
3 | MLP | 72.10 | 68.71 | 84.86 | 58.10 |
4 | LightGBM | 77.09 | 77.23 | 76.76 | 60.21 |
5 | XGBoost | 74.55 | 77.68 | 75.10 | 59.86 |
6 | CatBoost | 74.55 | 77.01 | 77.46 | 62.98 |
7 | BERT | 80.41 | 81.45 | 86.89 | 65.91 |
8 | BERT+TextCNN1 | 82.17 | 81.98 | 87.69 | 66.33 |
9 | BERT+TextCNN2 | 83.35 | 82.18 | 88.49 | 67.14 |
10 | BERT+BiLSTM | 82.21 | 82.28 | 88.10 | 67.69 |
11 | BERT+TextCNN and BiLSTM1 | 84.01 | 83.09 | 89.18 | 68.68 |
12 | B-TBM | 85.51 | 84.22 | 89.92 | 69.88 |
Index | Method | AUC Values for Severity Labels | AUC Values for Possibility Labels |
---|---|---|---|
1 | SVM | 0.93 | 0.88 |
2 | KNN | 0.86 | 0.79 |
3 | MLP | 0.90 | 0.79 |
4 | LightGBM | 0.93 | 0.88 |
5 | XGBoost | 0.91 | 0.86 |
6 | CatBoost | 0.93 | 0.88 |
7 | BERT | 0.90 | 0.91 |
8 | BERT+TextCNN1 | 0.94 | 0.94 |
9 | BERT+TextCNN2 | 0.93 | 0.94 |
10 | BERT+BiLSTM | 0.94 | 0.95 |
11 | BERT+TextCNN and BiLSTM1 | 0.95 | 0.95 |
12 | B-TBM | 0.95 | 0.96 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Xu, B.; Lu, D.; Gao, D.; Zhang, B. B-TBM: A Novel Deep Learning Model with Enhanced Loss Function for HAZOP Risk Classification Using Natural Language Statistical Laws. Processes 2024, 12, 2373. https://doi.org/10.3390/pr12112373