DdERT: Research on Named Entity Recognition for Mine Hoist Using a Chinese BERT Model
Abstract
1. Introduction
- Constructing a dataset in the field of mine hoist faults: We conducted an innovative study that fills a gap in this field: to date, no dataset covers both entity recognition and fault diagnosis for mine hoists. This study combines internal data from mining companies with the publicly available literature and constructs a high-quality mine hoist fault dataset through careful manual annotation. Although the dataset cannot be publicly released, it provides a detailed reference and strong support for research and development in this field;
- Constructing the DdERT model: We embed a mine hoist fault dictionary into the BERT model, combining a modified encoding-layer fusion algorithm with the dictionary entries matched to each input character during training. At the same time, a conditional random field (CRF) [9] is used as the model's classifier to alleviate the sample-imbalance problem in the mine hoisting field and further improve the DdERT model's performance. Compared with other models, the F1 score of the DdERT model is greatly improved.
2. Related Work
3. Model
- (1)
- Input layer: The input layer processes the input text. In this layer, we employ the BERT model, which can map the individual words and terms in the input text to their corresponding vector representations. These vector representations, including character-level and word-level vectors, are combined as the input to the model.
- (2)
- Dictionary fusion encoding layer: The core purpose of this layer is to fuse the character-level feature vectors with the word-level feature vectors. To achieve this, the dictionary fusion encoding layer inserts a lexicon adapter between two transformer encoders. Through the lexicon adapter, the model matches each character-level feature vector with its corresponding word-level feature vectors and learns how to fuse the two. This fusion helps the model identify entity boundaries more accurately and improves its encoding of semantic information.
- (3)
- Decoding Layer: The decoding layer further processes the character-level and word-level feature vectors based on the dictionary fusion encoding layer. After the composite operation of multiple transformer layers and dictionary adapters, the model can learn the fusion weights of these feature vectors. Then, CRF is used to fine-tune the entity recognition results of the model to output the globally optimal annotation sequence.
3.1. Algorithm for Building Domain Dictionary
Algorithm 1: Dictionary tree construction algorithm
Input: a sentence s with n characters
Output: the constructed dictionary tree
1. Initialize the word list
2. Initialize the dictionary tree root node
3. for i = 1 to n do
4.   repeat
5.     Traverse the word list to obtain the words matching the current character
6.     Use the matched words to construct dictionary tree child nodes
7.   until the traversal is complete
8. end for
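Algorithm 1 can be sketched as a standard trie (prefix tree) built from the domain word list; the class and function names below are illustrative, not the authors' implementation:

```python
class TrieNode:
    def __init__(self):
        self.children = {}      # next character -> TrieNode
        self.is_word = False    # marks the end of a complete dictionary word

def build_trie(word_list):
    """Insert every domain feature word into the trie, one character per node."""
    root = TrieNode()
    for word in word_list:
        node = root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root
```

Looking up whether any sentence substring is a dictionary word then takes time proportional to the substring's length, which is what makes the subsequence matching in Section 3.2 practical.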
3.2. The Dictionary-Enhanced BERT
- (1)
- Subsequence Matching: For a given input sentence S, all character subsequences are traversed and matched against a pre-constructed dictionary tree (feature word repository) using a dictionary fusion algorithm. Each character can align with up to three feature words, with any shortfall filled by the special symbol <PAD>.
- (2)
- BERT Embedding: The dictionary-augmented BERT model embeds the character-word pairs, generating the corresponding word vectors.
- (3)
- Transformer Processing: The word vectors from the previous step are fed into BERT’s transformer encoders. To incorporate domain-specific insights, a dictionary adapter is interposed between consecutive layers of the transformer, aiming to infuse domain-specific lexical information.
- (4)
- Lexicon Adaptation: A dictionary adapter performs transformations upon specific layers’ outputs, integrating lexical insights into the feature vector.
- (5)
- Subsequent Processing: Post-transformation, these feature vectors proceed through additional transformer layers, allowing the model to better comprehend and represent the textual data.
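Step (1), subsequence matching, can be sketched as follows. For brevity this uses a plain set of dictionary words rather than the trie of Algorithm 1; function and parameter names are placeholders:

```python
def match_words(sentence, lexicon, max_words=3, pad="<PAD>"):
    """For each character, collect the dictionary words that cover it
    (up to max_words), padding any shortfall with <PAD>, as in step (1)."""
    per_char = [[] for _ in sentence]
    n = len(sentence)
    for i in range(n):
        for j in range(i + 1, n + 1):
            if sentence[i:j] in lexicon:
                # every character inside the matched span records this word
                for k in range(i, j):
                    if len(per_char[k]) < max_words:
                        per_char[k].append(sentence[i:j])
    for words in per_char:
        words.extend([pad] * (max_words - len(words)))
    return per_char
```

Each character thus aligns with at most three feature words, matching the paper's "Maximum fusion of vocabulary information per Chinese character = 3" setting.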
3.3. Dictionary Fusion Algorithm
Algorithm 2: Dictionary fusion algorithm
Input: the character sequence and the domain dictionary tree
Output: character vectors fused with lexical information
1. for i = 1 to n do
     for j = i to n do
       match the subsequence s[i..j] against the dictionary tree
     end for
   end for
2. Apply a nonlinear transformation to the matched word embedding vectors
3. for i = 1 to n do
4.   Calculate the correlation between the character and its domain feature words
5.   Calculate the weighted sum of all matched words
6.   Inject the weighted lexical information into the character vector
7. end for
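Steps 2-6 of the fusion can be sketched numerically in the style of a LEBERT-like lexicon adapter [22]. The weight matrices, dimensions, and random values below are toy assumptions for illustration, not the trained parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
d_char, d_word, n_words = 8, 6, 3   # toy dimensions (hidden size, word-embedding size, matched words)

# Illustrative parameters: W1/W2 project word vectors into character space,
# W_attn scores character-word relevance.
W1 = rng.normal(size=(d_word, d_char))
W2 = rng.normal(size=(d_char, d_char))
W_attn = rng.normal(size=(d_char, d_char))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def lexicon_adapter(h_char, word_embs):
    """Fuse the matched-word embeddings into one character vector.
    h_char: (d_char,) character hidden state; word_embs: (n_words, d_word)."""
    v = np.tanh(word_embs @ W1) @ W2   # step 2: nonlinear transform of word vectors
    scores = h_char @ W_attn @ v.T     # step 4: character-word correlation
    a = softmax(scores)                # attention weights over matched words
    z = a @ v                          # step 5: weighted sum of all words
    return h_char + z                  # step 6: inject lexical info into the character vector
```

The residual addition in the last line is what lets the adapter sit between transformer layers without disturbing BERT's original representation when no useful word matches exist.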
3.4. Decoding Layer
- (1)
- Calculating the probability score of a candidate label sequence
- (2)
- The probability of a label sequence, normalized over all possible sequences
- (3)
- The loss function, taken as the negative log-likelihood of the gold label sequence
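The three steps above follow the standard linear-chain CRF formulation. A minimal numerical sketch (toy shapes and parameter values, not the authors' implementation):

```python
import numpy as np

def crf_log_likelihood(emissions, transitions, tags):
    """Linear-chain CRF. (1) the score of a tag path is the sum of emission and
    transition scores; (2) its probability is exp(score)/Z, where Z sums over
    all K^T paths via the forward algorithm; (3) the loss is -log P(y|x).
    emissions: (T, K) per-token label scores; transitions: (K, K)."""
    T, K = emissions.shape
    # (1) probability score of the given tag path
    score = emissions[0, tags[0]]
    for t in range(1, T):
        score += transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    # (2) log partition function Z (forward algorithm, in log space)
    alpha = emissions[0]
    for t in range(1, T):
        alpha = emissions[t] + np.logaddexp.reduce(alpha[:, None] + transitions, axis=0)
    log_Z = np.logaddexp.reduce(alpha)
    # (3) log-likelihood; training minimizes its negation
    return score - log_Z
```

Decoding with Viterbi instead of the forward algorithm yields the globally optimal annotation sequence mentioned in Section 3.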
4. Dataset
4.1. Data Analysis
4.1.1. Design of Fault Diagnosis Ontology Model Structure
4.1.2. Data Entity Type
4.2. Data Sources
5. Experiment
5.1. Evaluation Metrics
5.2. Experimental Parameters
5.3. Comparative Experiments
5.4. Comparative Experiments
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
- Senapati, B.R.; Khilar, P.M.; Swain, R.R. Composite fault diagnosis methodology for urban vehicular ad hoc network. Veh. Commun. 2021, 29, 100337. [Google Scholar] [CrossRef]
- Zhou, X.; Sun, J.; Cui, P.; Lu, Y.; Lu, M.; Yu, Y. A Fast and Robust Open-Switch Fault Diagnosis Method for Variable-Speed PMSM System. IEEE Trans. Power Electron. 2020, 36, 2598–2610. [Google Scholar] [CrossRef]
- Liu, Z.; Xiao, W.; Cui, J.; Mei, L. Application of an information fusion method to the incipient fault diagnosis of the drilling permanent magnet synchronous motor. J. Pet. Sci. Eng. 2022, 219, 111124. [Google Scholar] [CrossRef]
- Ji, S.; Pan, S.; Cambria, E.; Marttinen, P.; Philip, S.Y. A Survey on Knowledge Graphs: Representation, Acquisition, and Applications. IEEE Trans. Neural Netw. Learn. Syst. 2021, 33, 494–514. [Google Scholar] [CrossRef] [PubMed]
- Wei, Z.; Su, J.; Wang, Y.; Tian, Y.; Chang, Y. A Novel Cascade Binary Tagging Framework for Relational Triple Extraction. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Virtual, 5–7 July 2020. [Google Scholar]
- Tiddi, I.; Schlobach, S. Knowledge graphs as tools for explainable machine learning: A survey. Artif. Intell. 2022, 302, 103627. [Google Scholar] [CrossRef]
- Lv, K.; Gao, C.; Si, J.; Feng, H.; Cao, W. Fault Coil Location of Inter-Turn Short-Circuit for Direct-Drive Permanent Magnet Synchronous Motor Using Knowledge Graph. IET Electr. Power Appl. 2020, 14, 1712–1721. [Google Scholar] [CrossRef]
- Li, J.; Sun, A.; Han, J.; Li, C. A Survey on Deep Learning for Named Entity Recognition. IEEE Trans. Knowl. Data Eng. 2020, 34, 50–70. [Google Scholar] [CrossRef]
- Nuo, Y.; Yan, X.; Yu, Z.; Huang, S.; Guo, J. A Khmer NER method based on conditional random fields fusing with Khmer entity characteristics constraints. In Proceedings of the 2017 29th Chinese Control and Decision Conference (CCDC), Chongqing, China, 28–30 May 2017. [Google Scholar]
- Eftimov, T.; Seljak, B.K.; Korošec, P. A rule-based named-entity recognition method for knowledge extraction of evidence-based dietary recommendations. PLoS ONE 2017, 12, e0179488. [Google Scholar] [CrossRef]
- Li, X.; Lv, X.; Liu, K. Automatic Recognition of Chinese Location Entity. In International Conference Natural Language Processing; Springer: Berlin/Heidelberg, Germany, 2014. [Google Scholar]
- Fang, X.; Sheng, H. A Hybrid Approach for Chinese Named Entity Recognition. In International Conference on Discovery Science; Springer: Berlin/Heidelberg, Germany, 2002. [Google Scholar]
- Tsai, R.T.H.; Wu, S.H.; Lee, C.W.; Shih, C.W.; Hsu, W.L. Mencius: A Chinese Named Entity Recognizer Using Maximum Entropy-based Hybrid Model. Int. J. Comput. Linguist. Chin. Lang. Process. 2004, 9, 65–82. [Google Scholar]
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv 2018, arXiv:1810.04805. [Google Scholar]
- Souza, F.; Nogueira, R.; Lotufo, R. Portuguese Named Entity Recognition using BERT-CRF. arXiv 2019, arXiv:1909.10649. [Google Scholar]
- Li, X.; Zhang, H.; Zhou, X.H. Chinese Clinical Named Entity Recognition with Variant Neural Structures Based on BERT Methods. J. Biomed. Inform. 2020, 107, 103422. [Google Scholar] [CrossRef]
- Wu, Y.; Huang, J.; Xu, C.; Zheng, H.; Zhang, L.; Wan, J. Research on Named Entity Recognition of Electronic Medical Records Based on RoBERTa and Radical-Level Feature. Wirel. Commun. Mob. Comput. 2021, 2021, 2489754. [Google Scholar] [CrossRef]
- Huang, Z.; Xu, W.; Yu, K. Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv 2015, arXiv:1508.01991. [Google Scholar]
- Liang, C.; Yu, Y.; Jiang, H.; Er, S.; Wang, R.; Zhao, T.; Zhang, C. BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Virtual, 6–10 July 2020; ACM: New York, NY, USA, 2020. [Google Scholar]
- Liu, P.; Tian, B.; Liu, X.; Gu, S.; Yan, L.; Bullock, L.; Ma, C.; Liu, Y.; Zhang, W. Construction of Power Fault Knowledge Graph Based on Deep Learning. Appl. Sci. 2022, 12, 6993. [Google Scholar] [CrossRef]
- Baigang, M.; Yi, F. A review: Development of named entity recognition (NER) technology for aeronautical information intelligence. Artif. Intell. Rev. 2022, 56, 1515–1542. [Google Scholar] [CrossRef]
- Liu, W.; Fu, X.; Zhang, Y.; Xiao, W. Lexicon Enhanced Chinese Sequence Labeling Using BERT Adapter. arXiv 2021, arXiv:2105.07148. [Google Scholar]
- Li, H.; Fu, Y.; Yan, Y.; Li, J. Construction of Multi-modal Domain Knowledge Graph Based on LEBERT. Comput. Syst. Appl. 2022, 31, 79–90. [Google Scholar]
- Wu, G.; Fan, C.; Tao, G.; He, Y. Entity recognition of electronic medical records based on LEBERT-BCF. Comput. Era 2023, 2, 92–97. [Google Scholar]
- Li, J.; He, N.; Zhang, J.; Wang, Q.; Li, H. Fault diagnosis method for mine hoisting motor based on VMD and CNN-BiLSTM. J. Mine Autom. 2023, 49, 49–59. [Google Scholar]
- Ruan, K.; Kou, Z.; Wang, Y.; Wu, J. Digital twin rapid construction method of a mining hoisting system. Coal Sci. Technol. 2022, 51, 1–13. [Google Scholar]
- Guo, X.; Li, J.; Miao, D.; Li, B. A fault early warning model of mine hoist based on LSTM-Adam. J. Mech. Electr. Eng. 2023. Available online: https://kns.cnki.net/kcms/detail/33.1088.TH.20230823.1347.004.html (accessed on 29 August 2023).
- Cao, H. Construction and Application of Remote Monitoring System of the Mine Hoist. Autom. Appl. 2023, 64, 104–105+108. [Google Scholar]
- Goldberg, Y.; Levy, O. word2vec Explained: Deriving Mikolov et al.’s negative-sampling word-embedding method. arXiv 2014, arXiv:1402.3722. [Google Scholar]
- Mikolov, T.; Chen, K.; Corrado, G.; Dean, J. Efficient Estimation of Word Representations in Vector Space. arXiv 2013, arXiv:1301.3781. [Google Scholar]
- Ferragina, P.; González, R.; Navarro, G.; Venturini, R. Compressed text indexes: From theory to practice. J. Exp. Algorithmics 2009, 13, 1–12. [Google Scholar] [CrossRef]
- Liu, C.; Yu, Y.; Li, X.; Wang, P. Application of Entity Relation Extraction Method Under CRF and Syntax Analysis Tree in the Construction of Military Equipment Knowledge Graph. IEEE Access 2020, 8, 200581–200588. [Google Scholar] [CrossRef]
- Uschold, M.; Gruninger, M. Ontologies: Principles, methods and applications. Knowl. Eng. Rev. 1996, 11, 93–136. [Google Scholar] [CrossRef]
- Gruninger, M.; Fox, M.S. Methodology for the Design and Evaluation of Ontologies. In Proceedings of the Workshop on Implemented Ontologies, European Conference on Artificial Intelligence (ECAI), Amsterdam, The Netherlands, 8–9 August 1994. [Google Scholar]
- Yang, Y.J.; Xu, B.; Hu, J.W.; Tong, M.H.; Zhang, P.; Zheng, L. Accurate and efficient method for constructing domain knowledge graph. J. Softw. 2018, 29, 2931–2947. [Google Scholar]
- GB/T 35737-2017; Multi-Rope Mine Drum Hoist. Mining Machinery Standardization Technical Committee of the People’s Republic of China: Beijing, China, 2017.
- GB/T 20961-2018; Single-Rope Mine Drum Hoist. Mining Machinery Standardization Technical Committee of the People’s Republic of China: Beijing, China, 2018.
Entity Type | Description
---|---
Fault Phenomenon | External manifestations of faults
Repair Measures | Measures taken to restore the faulty item to its usable state
Fault Effect | Result of a failure mode on the use, function, or state of a product
Fault Cause | Factors related to design, manufacturing, use, and maintenance that cause faults
Fault Location | The location where the fault occurred
Parameter Name | Value
---|---
Word2vec word vector dimensionality | 200
Learning rate | 1 × 10−5
Batch size | 16
Dropout rate | 0.1
Hidden layer dimension | 768
Number of attention heads | 12
Maximum sentence length | 128
Maximum number of matched words per Chinese character | 3
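The hyperparameters in the table above, collected into a single illustrative training configuration; the key names are assumptions for readability, not identifiers from the authors' code:

```python
# Hypothetical configuration mirroring the paper's reported hyperparameters.
config = {
    "word2vec_dim": 200,          # Word2vec word vector dimensionality
    "learning_rate": 1e-5,
    "batch_size": 16,
    "dropout": 0.1,               # random discard (dropout) rate
    "hidden_size": 768,           # BERT hidden layer dimension
    "num_attention_heads": 12,
    "max_seq_length": 128,        # maximum sentence length
    "max_matched_words": 3,       # matched words fused per Chinese character
}

# Sanity check: the hidden size must divide evenly across attention heads.
assert config["hidden_size"] % config["num_attention_heads"] == 0
```

The 768-dimension / 12-head split (64 dimensions per head) matches the standard BERT-base configuration, consistent with the Chinese BERT backbone described in Section 3.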
Environment | Experimental Configuration
---|---
GPU | NVIDIA Tesla P100
CPU | Intel Xeon (Cascade Lake) Gold 6240
Memory | 64 GB
Programming language | Python 3.8
Training framework | PyTorch 1.8.1+cu92
Share and Cite
Dang, X.; Wang, L.; Dong, X.; Li, F.; Deng, H. DdERT: Research on Named Entity Recognition for Mine Hoist Using a Chinese BERT Model. Electronics 2023, 12, 4037. https://doi.org/10.3390/electronics12194037