Defect Severity Identification for a Catenary System Based on Deep Semantic Learning
Abstract
1. Introduction
- A deep semantic neural network named BERT-DTCN is proposed to effectively extract long-range semantic features and automatically identify defect severity from catenary text records.
- Different from existing domain text representation approaches that extract vector representations with bag-of-words features, we applied BERT to learn word embedding vectors and extract semantic features of domain vocabulary in the defect texts. An ablation study on the constructed catenary defect text dataset validates that the generated word embedding vectors have a beneficial impact on the devised text categorization model.
- Based on the obtained defect word embeddings, we used the DTCN to distinguish the defect severity degree. Experimental results demonstrate that the proposed BERT-DTCN outperforms competitive text classification methods on the binary classification problem (level 1 vs. level 2 defect), reducing the workload of manual discrimination and improving classification accuracy and efficiency. A sketch of the overall pipeline follows this list.
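As a rough illustration of that pipeline (a minimal sketch under stated assumptions, not the authors' released implementation), the snippet below wires a pretrained Chinese BERT encoder to a small convolutional classification head. The head shown here is a stand-in for the DTCN; the 140-channel width and the padding length of 32 are taken from the experiment tables later in the paper, and the example sentence is invented for illustration.

```python
# Sketch: BERT contextual embeddings feeding a convolutional classifier.
# Assumes the HuggingFace `transformers` library; the head is a placeholder
# for the DTCN described in Section 3.4.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class BertDTCN(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        # Stand-in head; the paper's DTCN uses equal-width convolutions
        # and convolution-residual blocks (sketched in Section 3.4).
        self.head = nn.Sequential(
            nn.Conv1d(768, 140, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
            nn.Flatten(),
            nn.Linear(140, num_classes),
        )

    def forward(self, input_ids, attention_mask):
        # (batch, seq_len, 768) context-aware embeddings from BERT
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        return self.head(hidden.transpose(1, 2))  # Conv1d expects (B, C, L)

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
batch = tokenizer(["定位器坡度超限"], padding="max_length",
                  max_length=32, truncation=True, return_tensors="pt")
logits = BertDTCN()(batch["input_ids"], batch["attention_mask"])  # (1, 2)
```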
2. Related Work
2.1. Text Classification Methods
2.2. Text Representation Models
3. Methodology
- Catenary defect text database: After obtaining the catenary defect texts accumulated in the data center of the China Railway Administration during long-term operation and maintenance, the texts are preprocessed and a catenary defect text dataset is constructed (a preprocessing sketch follows this list).
- Word embedding representations: The BERT model projects the Chinese catenary defect texts into context-aware representations that machines can process and understand.
- Classification of texts to distinguish the defect level: The DTCN module is trained to categorize the catenary defect texts using equal-width convolutions and multiple convolution-residual layers, with stride-2 pooling layers for downsampling.
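A hedged sketch of the first step, dataset construction: the cleaning rules and the field layout of the raw records (a description string paired with a severity label) are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical preprocessing sketch: strip record metadata and pair each
# cleaned defect description with its severity label (1 or 2).
import re

def clean_record(raw: str) -> str:
    """Remove dates, long report/unit codes, and excess whitespace."""
    text = re.sub(r"\d{4}-\d{2}-\d{2}", "", raw)   # drop ISO-style dates
    text = re.sub(r"[A-Za-z0-9_-]{6,}", "", text)  # drop long IDs/unit codes
    return re.sub(r"\s+", " ", text).strip()

def build_dataset(records):
    """records: iterable of (defect_description, severity_level) pairs."""
    return [(clean_record(desc), int(level))
            for desc, level in records
            if desc and int(level) in (1, 2)]
```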
3.1. Problem Definition
3.2. Catenary Defect Text Database
3.2.1. Data Source and Text Content
3.2.2. Characteristic Analysis
- Diversity. The operation and maintenance texts for the catenary system contain the time, number, unit, defective component, and defect description of each record.
- Correlation. The operation and maintenance texts are closely tied to railway transportation and contain a large number of rail transit terminologies.
- Uncertainty. A great deal of defect descriptions in the catenary texts might be incomplete, noisy, fuzzy, or random.
- Polysemy. Several polysemous words in the defect texts carry multiple meanings and must be disambiguated according to their context.
3.2.3. Data Processing
3.3. Word Embedding
3.3.1. Input Layer
3.3.2. BERT Encoder
- Creating three vectors (i.e., a query vector, a key vector, and a value vector) from each of the encoder's input vectors and obtaining a weighted score by calculating the dot products of the query with all keys. It can be calculated as:

  $$Q = XW^{Q}, \quad K = XW^{K}, \quad V = XW^{V}, \quad S = QK^{\top} \tag{1}$$

- Dividing the scores by the scaling factor $\sqrt{d_k}$ (the square root of the key dimension) and then normalizing the scores through a softmax operation. It can be represented as:

  $$A = \mathrm{softmax}\!\left(\frac{S}{\sqrt{d_k}}\right) \tag{2}$$

- Multiplying each value vector by the softmax scores and summing up the weighted value vectors (implemented in the sketch following this list). It can be defined as:

  $$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V = AV \tag{3}$$
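These three steps compose standard scaled dot-product self-attention, as used inside each BERT encoder layer. A minimal single-head PyTorch sketch (random weights, no learned bias, purely illustrative):

```python
# Sketch of the three self-attention steps above.
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    q, k, v = x @ w_q, x @ w_k, x @ w_v            # step (1): build Q, K, V
    scores = q @ k.transpose(-2, -1)               # step (1): query-key dot products
    weights = torch.softmax(scores / math.sqrt(k.size(-1)), dim=-1)  # step (2)
    return weights @ v                             # step (3): weighted sum of values

seq_len, d_model, d_k = 32, 768, 64
x = torch.randn(seq_len, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_k) for _ in range(3))
out = self_attention(x, w_q, w_k, w_v)             # shape: (32, 64)
```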
3.4. Deep Text Categorization Network
3.4.1. Embedding Layer
3.4.2. Downsampling with the Number of Feature Maps Fixed
3.4.3. Shortcut Connections with Pre-Activation
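Putting Sections 3.4.1–3.4.3 together, one DTCN convolution-residual block might look like the sketch below: pre-activation equal-width convolutions with a shortcut connection, followed by stride-2 max pooling that halves the sequence length while the number of feature maps stays fixed. The kernel size of 3 for both convolution and pooling is an assumption in the style of DPCNN; the 140 feature maps match the best setting in the ablation table.

```python
# Sketch of one convolution-residual block with pre-activation shortcut
# and stride-2 downsampling (feature-map count unchanged).
import torch
import torch.nn as nn

class ConvResidualBlock(nn.Module):
    def __init__(self, channels: int = 140, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2  # "equal-width": output length == input length
        self.conv1 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size, padding=pad)
        self.pool = nn.MaxPool1d(kernel_size=3, stride=2, padding=1)

    def forward(self, x):
        out = self.conv1(torch.relu(x))   # pre-activation: ReLU before conv
        out = self.conv2(torch.relu(out))
        out = out + x                     # shortcut connection
        return self.pool(out)             # halve length, keep channels fixed

x = torch.randn(8, 140, 32)               # (batch, feature maps, seq length)
y = ConvResidualBlock()(x)                # -> (8, 140, 16)
```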
Algorithm 1: Pseudocode for training the BERT-DTCN.
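A hedged sketch of such a training loop, consistent with the hyperparameters in Section 4.1 (Adam, learning rate 5e-5, batch size 128, 20 epochs) but not the authors' exact pseudocode. It assumes the datasets yield `(input_ids, attention_mask, labels)` tuples (e.g., a `TensorDataset`).

```python
# Illustrative fine-tuning loop for BERT-DTCN with validation after each epoch.
import torch
from torch.utils.data import DataLoader

@torch.no_grad()
def evaluate(model, dataset, batch_size=128):
    model.eval()
    correct = total = 0
    for input_ids, attention_mask, labels in DataLoader(dataset, batch_size=batch_size):
        preds = model(input_ids, attention_mask).argmax(dim=-1)
        correct += (preds == labels).sum().item()
        total += labels.numel()
    return correct / total

def train(model, train_set, val_set, epochs=20, lr=5e-5, batch_size=128):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    for epoch in range(epochs):
        model.train()
        for input_ids, attention_mask, labels in loader:
            opt.zero_grad()
            loss = loss_fn(model(input_ids, attention_mask), labels)
            loss.backward()
            opt.step()
        print(f"epoch {epoch + 1}/{epochs}: val acc = {evaluate(model, val_set):.4f}")
```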
4. Experiment Results and Analysis
4.1. Training Protocol
4.2. Ablation Study
4.3. Classification Performance Comparison
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Dataset Name | Classes of Defect Level | Training Set | Validation Set | Test Set
---|---|---|---|---
Catenary Defect Text | 2 | 7611 | 1652 | 1653
Hyperparameter | Setting | Hyperparameter | Setting |
---|---|---|---|
Learning rate | 0.00005 | Padding size | 32 |
Optimizer | Adam | Embedding dimension | 768
Batch size | 128 | Epochs | 20
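The same settings expressed as a configuration mapping (illustrative; the key names are ours, not from the paper):

```python
# Training hyperparameters from the table above.
CONFIG = {
    "learning_rate": 5e-5,   # Adam optimizer
    "batch_size": 128,
    "epochs": 20,
    "padding_size": 32,      # max token length per defect record
    "embedding_dim": 768,    # BERT-base hidden size
}
```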
Structure | Number of Convolution Kernels | Acc/% | Training Time
---|---|---|---
Two equal-width convolution layers | 120 | 96.73 | 17 min 25 s
 | 130 | 96.49 | 17 min 15 s
 | 140 | 96.49 | 17 min 12 s
 | 150 | 96.67 | 17 min 19 s
 | 160 | 97.10 | 17 min 19 s
Three equal-width convolution layers | 120 | 96.61 | 17 min 13 s
 | 130 | 97.04 | 17 min 46 s
 | 140 | 97.34 | 17 min 26 s
 | 150 | 97.22 | 17 min 54 s
 | 160 | 97.22 | 18 min 31 s
Four equal-width convolution layers | 120 | 96.67 | 18 min 01 s
 | 130 | 97.22 | 17 min 29 s
 | 140 | 97.40 | 17 min 29 s
 | 150 | 97.16 | 17 min 16 s
 | 160 | 96.43 | 17 min 18 s
Stage | Layer | Input Size | Kernel Size | Stride | Output Size | Number
---|---|---|---|---|---|---
Input | BERT |  | — | — |  | 1
Embedding layer | Conv2 |  |  | 1 |  | 1
Equal-width convolution layers | Padding1_1 |  | — | — |  | 1
 | Conv2_1 |  |  |  |  | 1
 | Padding1_2 |  | — | — |  |
 | Conv2_2 |  |  |  |  | 1
 | Padding1_3 |  | — | — |  |
 | Conv2_3 |  |  |  |  | 1
 | Padding1_4 |  | — | — |  |
 | Conv2_4 |  |  |  |  | 1
Convolution block | Padding2 |  | — | — |  | 4
 | Max-pooling |  |  | 2 |  |
 | Padding1_1 |  | — | — |  |
 | Conv2_1 |  |  |  |  | 1
 | Padding1_2 |  | — | — |  |
 | Conv2_2 |  |  |  |  | 1
 | Padding1_3 |  | — | — |  |
 | Conv2_3 |  |  |  |  | 1
 | Padding1_4 |  | — | — |  |
 | Conv2_4 |  |  |  |  | 1
Output | Fully connected layer | 140 | — | — | 2 | 1
Test-set results (Severity Level 1: 827 samples; Severity Level 2: 826 samples).

Model | P/% (L1) | R/% (L1) | F1/% (L1) | P/% (L2) | R/% (L2) | F1/% (L2) | P/% (macro) | R/% (macro) | F1/% (macro) | Acc/%
---|---|---|---|---|---|---|---|---|---|---
DTCN | 96.75 | 97.22 | 96.98 | 97.20 | 96.73 | 96.97 | 96.98 | 96.98 | 96.98 | 96.98 |
BERT-DTCN | 97.23 | 97.58 | 97.40 | 97.57 | 97.22 | 97.39 | 97.40 | 97.40 | 97.40 | 97.40 |
Test-set comparison (Severity Level 1: 827 samples; Severity Level 2: 826 samples).

Model | P/% (L1) | R/% (L1) | F1/% (L1) | P/% (L2) | R/% (L2) | F1/% (L2) | P/% (macro) | R/% (macro) | F1/% (macro) | Acc/%
---|---|---|---|---|---|---|---|---|---|---
CNN | 96.01 | 96.01 | 96.01 | 96.00 | 96.00 | 96.00 | 96.01 | 96.01 | 96.01 | 96.01 |
RNN | 94.94 | 92.29 | 93.95 | 93.12 | 95.04 | 94.07 | 94.03 | 94.01 | 94.01 | 94.01 |
RCNN | 96.70 | 95.53 | 96.11 | 95.57 | 96.73 | 96.15 | 96.13 | 96.13 | 96.13 | 96.13 |
Atti-Bi-LSTM | 95.81 | 94.07 | 94.94 | 94.17 | 95.88 | 95.02 | 94.99 | 94.98 | 94.98 | 94.98 |
FastText | 93.41 | 92.50 | 92.95 | 92.57 | 93.46 | 93.01 | 92.99 | 92.98 | 92.98 | 92.98 |
Transformer | 94.04 | 91.66 | 92.84 | 91.85 | 94.19 | 93.01 | 92.95 | 92.92 | 92.92 | 92.92 |
BERT-DTCN | 97.23 | 97.58 | 97.40 | 97.57 | 97.22 | 97.39 | 97.40 | 97.40 | 97.40 | 97.40 |
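For reference, the per-class and macro-averaged metrics reported in these tables can be computed from model predictions as follows (a usage sketch with scikit-learn, not the authors' evaluation script):

```python
# Sketch: per-class and macro-averaged precision/recall/F1 plus accuracy
# for the two severity levels, given true and predicted labels.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def report(y_true, y_pred):
    p, r, f1, support = precision_recall_fscore_support(
        y_true, y_pred, labels=[1, 2], average=None)
    macro_p, macro_r, macro_f1, _ = precision_recall_fscore_support(
        y_true, y_pred, labels=[1, 2], average="macro")
    return {
        "per_class": list(zip(support, p, r, f1)),  # (n samples, P, R, F1)
        "macro": (macro_p, macro_r, macro_f1),
        "accuracy": accuracy_score(y_true, y_pred),
    }
```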