Natural Language Processing (NLP) in Aviation Safety: Systematic Review of Research and Outlook into the Future
Abstract
1. Introduction
- What is the performance of NLP applications on aviation safety-related subdomains?
- What are the challenges and limitations of these NLP applications?
2. Materials and Methods
2.1. Study Selection
- At least one NLP technology is applied;
- At least one sub-domain of aviation is addressed;
- Study must be related to safety;
- Study must be published in a peer-reviewed journal.
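The four inclusion criteria above act as a conjunctive filter: a study is retained only if it satisfies every one of them. A minimal sketch of that screening logic follows. The record type `CandidateStudy`, its field names, and the example records are hypothetical and serve only to illustrate the rule; they are not part of the review protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateStudy:
    """Hypothetical screening record; field names are illustrative only."""
    title: str
    applies_nlp: bool            # at least one NLP technology is applied
    aviation_subdomain: bool     # at least one aviation sub-domain is addressed
    safety_related: bool         # the study is related to safety
    peer_reviewed_journal: bool  # published in a peer-reviewed journal


def is_eligible(study: CandidateStudy) -> bool:
    """A study is included only if it meets all four criteria."""
    return all((
        study.applies_nlp,
        study.aviation_subdomain,
        study.safety_related,
        study.peer_reviewed_journal,
    ))


def screen(studies):
    """Return the subset of candidate studies that pass screening."""
    return [study for study in studies if is_eligible(study)]


# Invented records, used only to exercise the filter.
candidates = [
    CandidateStudy("Hypothetical study A", True, True, True, True),
    CandidateStudy("Hypothetical study B", True, True, False, True),
    CandidateStudy("Hypothetical study C", True, True, True, False),
]
included = screen(candidates)
```

Only the first record passes, because the other two each fail one criterion.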
2.2. Reported Factors
- Objective;
- The target database and language;
- Sample size;
- Model(s), including the NLP model(s) and any additional model(s);
- Performance of model(s).
3. Results
3.1. NLP Applications on Incident/Accident Reports
3.1.1. NLP Models
3.1.2. Latent Factor Reasoning and Labeling
3.1.3. Performance Comparison Based on Application Scenarios
3.2. NLP Applications on ATC
3.2.1. Automatic Speech Recognition
3.2.2. Operational Information Extraction
4. Discussion
4.1. Challenges and Limitations
4.1.1. Ambiguity and Context
4.1.2. Multilingual Support
4.1.3. Noise and Background Sounds
4.1.4. Limited Training Data
- It is usually considered an incomplete account of the whole process of an incident/accident [12] and is less formal than the NTSB reports, which include official investigation results;
- Objectivity is hindered by the nature (anonymity and confidentiality) of the reporting procedure [15].
4.1.5. Safety-Critical Systems
4.1.6. Real-Time Processing
4.1.7. Cost
4.2. Future Opportunities
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Rose, R.L.; Puranik, T.G.; Mavris, D.N. Natural Language Processing Based Method for Clustering and Analysis of Aviation Safety Narratives. Aerospace 2020, 7, 143.
2. Rose, R.L.; Puranik, T.G.; Mavris, D.N.; Rao, A.H. Application of structural topic modeling to aviation safety data. Reliab. Eng. Syst. Saf. 2022, 224, 108522.
3. Zhang, S.Y.; Kong, J.G.; Chen, C.; Li, Y.B.; Liang, H.J. Speech GAU: A single head attention for mandarin speech recognition for air traffic control. Aerospace 2022, 9, 395.
4. Xu, X.; Liu, W.; Gursoy, D. The impacts of service failure and recovery efforts on airline customers’ emotions and satisfaction. J. Travel Res. 2019, 58, 1034–1051.
5. Falessi, D.; Cantone, G.; Canfora, G. Empirical principles and an industrial case study in retrieving equivalent requirements via natural language processing techniques. IEEE Trans. Softw. Eng. 2011, 39, 18–44.
6. Amin, N.; Yother, T.; Johnson, M.; Rayz, J. Exploration of Natural Language Processing (NLP) Applications in Aviation. Coll. Aviat. Rev. Int. 2022, 40, 203–216.
7. Dong, T.; Yang, Q.; Ebadi, N.; Luo, X.R.; Rad, P. Identifying incident causal factors to improve aviation transportation safety: Proposing a deep learning approach. J. Adv. Transp. 2021, 2021, 5540046.
8. Badrinath, S.; Balakrishnan, H. Automatic Speech Recognition for Air Traffic Control Communications. Transp. Res. Rec. 2022, 2676, 798–810.
9. Jiao, Y.; Dong, J.; Han, J.; Sun, H. Classification and causes identification of Chinese civil aviation incident reports. Appl. Sci. 2022, 12, 10765.
10. Miyamoto, A.; Bendarkar, M.V.; Mavris, D.N. Natural Language Processing of Aviation Safety Reports to Identify Inefficient Operational Patterns. Aerospace 2022, 9, 450.
11. Kuhn, K.D. Using structural topic modeling to identify latent topics and trends in aviation incident reports. Transp. Res. Part C Emerg. Technol. 2018, 87, 105–122.
12. Zhang, X.G.; Mahadevan, S. Ensemble machine learning models for aviation incident risk prediction. Decis. Support Syst. 2019, 116, 48–63.
13. Shi, D.H.; Guan, J.; Zurada, J.; Manikas, A. A data-mining approach to identification of risk factors in safety management systems. J. Manag. Inf. Syst. 2017, 34, 1054–1081.
14. Zhang, X.G.; Srinivasan, P.; Mahadevan, S. Sequential deep learning from NTSB reports for aviation safety prognosis. Saf. Sci. 2021, 142, 105390.
15. Andrzejczak, C.; Karwowski, W.; Thompson, W. The identification of factors contributing to self-reported anomalies in civil aviation. Int. J. Occup. Saf. Ergon. 2014, 20, 3–18.
16. Jia, G.M.; Lu, Y.J.; Lu, W.B.; Shi, Y.H.; Yang, J.F. Verification method for Chinese aviation radiotelephony readbacks based on LSTM-RNN. Electron. Lett. 2017, 53, 401–403.
17. Lin, Y.; Deng, L.J.; Chen, Z.M.; Wu, X.P.; Zhang, J.W.; Yang, B. A real-time ATC safety monitoring framework using a deep learning approach. IEEE Trans. Intell. Transp. Syst. 2019, 21, 4572–4581.
18. Koteeswaran, S.; Malarvizhi, N.; Kannan, E.; Sasikala, S.; Geetha, S. Data mining application on aviation accident data for predicting topmost causes for accidents. Clust. Comput. 2017, 22, 11379–11399.
19. Madeira, T.; Melício, R.; Valério, D.; Santos, L. Machine learning and natural language processing for prediction of human factors in aviation incident reports. Aerospace 2021, 8, 47.
20. Tanguy, L.; Tulechki, N.; Urieli, A.; Hermann, E.; Raynal, C. Natural language processing for aviation safety reports: From classification to interactive analysis. Comput. Ind. 2016, 78, 80–95.
21. Carvalho, T. Natural Language Processing in Airline Maintenance Operations. Presented at Aerospace IT 2022, Chicago, IL, USA, 5 October 2022.
22. Irwin, W.J.; Robinson, S.D.; Belt, S.M. Visualization of Large-Scale Narrative Data Describing Human Error. Hum. Factors J. Hum. Factors Ergon. Soc. 2017, 59, 520–534.
23. Robinson, S.D. Temporal topic modeling applied to aviation safety reports: A subject matter expert review. Saf. Sci. 2019, 116, 275–286.
24. OpenAI. Available online: https://openai.com/ (accessed on 14 March 2023).
25. Groff, L. Applying Natural Language Processing Tools to Occurrence Reports. ICAO. Available online: https://www.icao.int/safety/iStars/Documents/IUG%20Meeting%201/Presentations/Applying%20Natural%20Language%20Processing%20Tools%20to%20Occurrence%20Reports%20-%20Loren%20Groff.pdf (accessed on 7 April 2023).
26. ICAO. Available online: https://www.icao.int/safety/Pages/Artificial-Intelligence-(AI).aspx (accessed on 7 April 2023).
27. Kopald, H. Automatic Speech Recognition and Understanding of ATC Voice Communications. In Proceedings of the Air Transportation Information Exchange Conference (ATIEC) 2021, Virtual Event, 16 September 2021.
28. NTSB. Available online: https://www.ntsb.gov/safety/safety-studies/Documents/SRR2201.pdf (accessed on 12 February 2023).
29. Page, M.J.; Moher, D.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ 2021, 372, n160.
30. Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. Int. J. Surg. 2021, 88, 105906.
31. Pons, E.; Braun, L.M.M.; Hunink, M.G.M.; Kors, J.A. Natural Language Processing in Radiology: A Systematic Review. Radiology 2016, 279, 329–343.
32. Kreimeyer, K.; Foster, M.; Pandey, A.; Arya, N.; Halford, G.; Jones, S.F.; Forshee, R.; Walderhaug, M.; Botsis, T. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. J. Biomed. Inform. 2017, 73, 14–29.
33. Dreisbach, C.; Koleck, T.A.; Bourne, P.E.; Bakken, S. A systematic review of natural language processing and text mining of symptoms from electronic patient-authored text data. Int. J. Med. Inform. 2019, 125, 37–46.
34. Ginieis, M.; Sánchez-Rebull, M.V.; Campa-Planas, F. The academic journal literature on air transport: Analysis using systematic literature review methodology. J. Air Transp. Manag. 2012, 19, 31–35.
35. Abedin, M.A.U.; Ng, V.; Khan, L. Cause identification from aviation safety incident reports via weakly supervised semantic lexicon construction. J. Artif. Intell. Res. 2010, 38, 569–631.
36. Ahadh, A.; Binish, G.V.; Srinivasan, R. Text mining of accident reports using semi-supervised keyword extraction and topic modeling. Process. Saf. Environ. Prot. 2021, 155, 455–465.
37. Perboli, G.; Gajetti, M.; Fedorov, S.; Giudice, S.L. Natural Language Processing for the identification of Human factors in aviation accidents causes: An application to the SHEL methodology. Expert Syst. Appl. 2021, 186, 115694.
38. Andrzejczak, C.; Karwowski, W.; Mikusinski, P. Application of diffusion maps to identify human factors of self-reported anomalies in aviation. Work 2012, 41, 188–197.
39. Robinson, S.D.; Irwin, W.J.; Kelly, T.K.; Wu, X.O. Application of machine learning to mapping primary causal factors in self-reported safety narratives. Saf. Sci. 2015, 75, 118–129.
40. Wiegmann, D.A.; Shappell, S.A. Human error analysis of commercial aviation accidents: Application of the human factors analysis and classification system (HFACS). Aviat. Space Environ. Med. 2001, 72, 1006–1016.
41. Lin, Y.; Guo, D.Y.; Zhang, J.W.; Chen, Z.M.; Yang, B. A unified framework for multilingual speech recognition in air traffic control systems. IEEE Trans. Neural Netw. Learn. Syst. 2021, 32, 3608–3620.
42. Sun, Z.; Tang, P. Automatic communication error detection using speech recognition and linguistic analysis for proactive control of loss of separation. Transp. Res. Rec. J. Transp. Res. Board 2021, 2675, 1–12.
43. Wang, X.; Mao, Y.; Wu, X.Y.; Xu, Q.C.; Jiang, W.Y.; Yin, S.W. An ATC instruction processing-based trajectory prediction algorithm designing. Neural Comput. Appl. 2021, 1–14.
44. Lin, Y.; Tan, X.; Yang, B.; Yang, K.; Zhang, J.; Yu, J. Real-time controlling dynamics sensing in air traffic system. Sensors 2019, 19, 679.
45. Vukovic, M.; Stolar, M.; Lech, M. Cognitive Load Estimation From Speech Commands to Simulated Aircraft. IEEE/ACM Trans. Audio Speech Lang. Process. 2021, 29, 1011–1022.
46. Tan, L.; Yu, K.; Lin, L.; Cheng, X.; Srivastava, G.; Lin, J.C.-W.; Wei, W. Speech Emotion Recognition Enhanced Traffic Efficiency Solution for Autonomous Vehicles in a 5G-Enabled Space–Air–Ground Integrated Intelligent Transportation System. IEEE Trans. Intell. Transp. Syst. 2021, 23, 2830–2842.
47. Biadsy, F. Automatic Dialect and Accent Recognition and its Application to Speech Recognition. Ph.D. Thesis, Columbia University, New York, NY, USA, 2011.
48. Haffner, P.; Tur, G.; Wright, J.H. Optimizing SVMs for complex call classification. In Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, (ICASSP ’03), Hong Kong, China, 6–10 April 2003; pp. I-632–I-635.
49. Yao, K.; Peng, B.; Zweig, G.; Yu, D.; Li, X.; Gao, F. Recurrent conditional random field for language understanding. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 4077–4081.
50. Bonnisseau, J.-M.; Lachiri, O. On the objective of firms under uncertainty with stock markets. J. Math. Econ. 2004, 40, 493–513.
51. Cordoba, R.D.; Ferreiros, J.; San-Segundo, R.; Macias-Guarasa, J.; Montero, J.M.; Fernandez, F.; D’Haro, L.F.; Pardo, J.M. Air traffic control speech recognition system cross-task and speaker adaptation. IEEE Aerosp. Electron. Syst. Mag. 2006, 21, 12–17.
52. Yao, K.; Peng, B.; Zhang, Y.; Yu, D.; Zweig, G.; Shi, Y. Spoken language understanding using long short-term memory neural networks. In Proceedings of the 2014 IEEE Spoken Language Technology Workshop (SLT), South Lake Tahoe, NV, USA, 7–10 December 2014.
53. Xu, P.; Sarikaya, R. Convolutional neural network based triangular CRF for joint intent detection and slot filling. In Proceedings of the 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, Olomouc, Czech Republic, 8–12 December 2013; pp. 78–83.
54. Guo, D.; Tur, G.; Yih, W.; Zweig, G. Joint semantic utterance classification and slot filling with recursive neural networks. In Proceedings of the 2014 IEEE Spoken Language Technology Workshop (SLT), South Lake Tahoe, NV, USA, 7–10 December 2014; pp. 554–559.
55. Zhou, K.; Yang, Q.; Sun, X.S.; Liu, S.H.; Lu, J.J. Improved CTC-Attention Based End-to-End Speech Recognition on Air Traffic Control. In Proceedings of the 9th International Conference on Intelligence Science and Big Data Engineering (IScIDE), Nanjing, China, 17–20 October 2019.
56. Wang, J.; Liu, S.H.; Yang, Q. Transfer learning for air traffic control LVCSR system. In Proceedings of the 2017 Second International Conference on Mechanical, Control and Computer Engineering (ICMCCE), Harbin, China, 10 December 2017.
57. Lin, Y.; Li, Q.; Yang, B. Improving speech recognition models with small samples for air traffic control systems. Neurocomputing 2021, 445, 287–297.
58. Srinivasamurthy, A.; Motlicek, P.; Himawan, I.; Szaszák, G.; Oualil, Y.; Helmke, H. Semi-supervised learning with semantic knowledge extraction for improved speech recognition in air traffic control. In Proceedings of the Interspeech 2017, Stockholm, Sweden, 20 August 2017.
59. Oualil, Y.; Klakow, D.; Szaszák, G.; Srinivasamurthy, A.; Helmke, H.; Motlicek, P. A context-aware speech recognition and understanding system for air traffic control domain. In Proceedings of the 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), Okinawa, Japan, 16–20 December 2017; pp. 404–408.
60. Nguyen, V.N. Using Linguistic Knowledge for Improving Automatic Speech Recognition Accuracy in Air Traffic Control. Master’s Thesis, Østfold University College, Halden, Norway, 2016.
61. Kopald, H.D.; Chanen, A.; Chen, S.; Smith, E.C.; Tarakan, R.M. Applying automatic speech recognition technology to Air Traffic Management. In Proceedings of the 2013 IEEE/AIAA 32nd Digital Avionics Systems Conference (DASC), East Syracuse, NY, USA, 5–10 October 2013.
62. Xiao, J.; Chennakesavan, A.; Chandra, C.; Bendarkar, M.V.; Kirby, M.; Mavris, D.N. BERT for aviation text classification. In Proceedings of the AIAA Aviation 2023 Forum, San Diego, CA, USA, 12–16 June 2023.
63. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
64. Kierszbaum, S.; Lapasset, L. Applying distilled BERT for question answering on ASRS reports. In Proceedings of the 2020 IEEE New Trends in Civil Aviation (NTCA), Prague, Czech Republic, 23–24 November 2020; pp. 33–38.
65. Andrade, S.R.; Walsh, H.S. SafeAeroBERT: Towards a safety-informed aerospace-specific language model. In Proceedings of the AIAA AVIATION 2023 Forum, San Diego, CA, USA, 12–16 June 2023.
66. Chandra, C.; Jing, X.; Bendarkar, M.V.; Sawant, K.; Elias, L.; Kirby, M.; Mavris, D.N. Aviation-BERT: A preliminary aviation-specific natural language model. In Proceedings of the AIAA AVIATION 2023 Forum, San Diego, CA, USA, 12–16 June 2023.
67. Tikayat Ray, A.; Cole, B.F.; Pinon Fischer, O.J.; White, R.T.; Mavris, D.N. aeroBERT-Classifier: Classification of Aerospace Requirements Using BERT. Aerospace 2023, 10, 279.
68. Maynard, P.; Clarke, S.S.; Almache, J.; Kumar, S.; Rajkumar, S.; Kemp, A.; Pai, R. Natural Language Processing (NLP) Techniques for Air Traffic Management Planning. In Proceedings of the AIAA Aviation 2021 Forum, Virtual Event, 2–6 August 2021.
Acronym | Full Name | Acronym | Full Name |
---|---|---|---|
ADS-B | Automatic Dependent Surveillance-Broadcast | LDA | Latent Dirichlet Allocation |
AM | Acoustic model | LM | Language model |
ANN | Artificial neural networks | LoS | Loss of separation |
ASR | Automatic speech recognition | LSA | Latent semantic analysis |
ASRS | Aviation Safety Reporting System | LSTM | Long short-term memory |
ATC | Air traffic control | MCNN | Multiscale CNN |
BERT | Bidirectional Encoder Representations from Transformers | MLP | Multilayer perceptron |
BLSTM | Bidirectional long short-term memory | NASA | National Aeronautics and Space Administration |
CAAC | Civil Aviation Administration of China | NB | Naïve Bayes |
CER | Character error rate | NER | Named entity recognition |
CFR | Code of Federal Regulations | NTSB | National Transportation Safety Board |
CNN | Convolutional neural networks | OC-POS | Occurrence position |
CRF | Conditional random field | PCA | Principal component analysis |
CTC | Connectionist temporal classification | PM | Pronunciation model |
DGAC | Directorate General for Civil Aviation | ResNet | Residual network |
EASA | European Union Aviation Safety Agency | RNN | Recurrent neural network |
FAA | Federal Aviation Administration | RTF | Real-time factor |
FC | Fully connected layers | SMEs | Subject matter experts |
GAU | Gated attention unit | SRL | Semantic role labeling |
GMM | Gaussian mixture model | STM | Structural topic modeling |
HFACS | Human factors analysis and classification system | SVD | Singular value decomposition |
HMI | Human–machine interface | SVM | Support vector machine |
HMM | Hidden Markov models | TF-IDF | Term frequency–inverse document frequency |
IATA | International Air Transport Association | t-SNE | T-distributed stochastic neighbor embedding |
ICAO | International Civil Aviation Organization | UAS | Unmanned aerial system |
k-NN | K-nearest neighbors algorithm | WER | Word error rate |
LAN | Label attention network | | |
NLP Techniques: Variable | NLP Techniques: Acronym | Aviation (and Its Sub-Domains): Variable | Aviation (and Its Sub-Domains): Acronym |
---|---|---|---|
Natural language processing | NLP | Air transportation | - |
Text mining | - | Air transport | - |
Text classification | - | Air traffic control | ATC |
Latent semantic analysis | LSA | Aerospace | - |
 | | Airport | - |
 | | Airline | - |
 | | Airplane | - |
 | | Aircraft | - |
Authors, Year | Objective(s) | Data Source | Sample Size | Language |
---|---|---|---|---|
Abedin et al., 2010 [35] | Identify the potential causes of aviation incidents. | Aviation Safety Reporting System (ASRS) | 1333 | English |
Shi et al., 2018 [13] | Identify risk factors in safety management systems. | ASRS | 168,227 | English |
Andrzejczak et al., 2014 [15] | Identify human factors contributing to anomalies. | ASRS | 127,776 | English |
Ahadh et al., 2021 [36] | Identify the stage of flight when an aviation accident occurs. | ASRS | 37,681 | English |
Zhang and Mahadevan, 2019 [12] | Quantify the risk relating to the consequences of hazardous events for aviation incident risk prediction. | ASRS | 64,573 | English |
Perboli et al., 2021 [37] | Identify human factors in the causes of aviation accidents. | Deloitte experts’ reports | 24 | English |
Jiao et al., 2022 [9] | Identify and classify causes in Chinese civil aviation incident reports. | Chinese accident reports | 20,000 | Chinese |
Robinson, 2019 [23] | Identify the temporal trends of factors affecting safety in commercial airline operations. | ASRS | 64,776 | English |
Tanguy et al., 2016 [20] | Identify tendencies of abnormality during a civil air flight. | ASRS and the French DGAC’s database | 136,861 | English and French |
Dong et al., 2021 [7] | Identify the primary factor and multiple contributing factors of each incident from the six most common causal factors. | ASRS | 181,651 | English |
Kuhn, 2018 [11] | Identify latent topics and trends in incident reports. | ASRS | Reports from 01/2010 to 04/2015 (count not indicated) | English |
Zhang et al., 2021 [14] | Automate the prognosis of aviation safety accidents. | NTSB | 1673 | English |
Andrzejczak et al., 2012 [38] | Identify human factors of self-reported anomalies. | ASRS | Not indicated | English |
Miyamoto et al., 2022 [10] | Identify inefficient operational patterns that cause flight delays and cancellations (from a safety perspective). | ASRS | 4195 | English |
Robinson et al., 2015 [39] | Map primary causal factors in self-reported safety narratives. | ASRS | 4497 | English |
Irwin et al., 2017 [22] | Visualize human errors for detailed analysis of text-based narratives. | ASRS | 4547 | English |
Rose et al., 2022 [2] | Identify themes within technical datasets. | ASRS and NTSB | 13,336 (ASRS) and 386 (NTSB) | English |
Koteeswaran et al., 2019 [18] | Predict the topmost causes from an aircraft accident database. | Aviation Accident Dataset (AAD) | 1379 | English |
Rose et al., 2020 [1] | Extract underlying trends from narratives. | ASRS | 13,336 | English |
Madeira et al., 2021 [19] | Identify and classify human factors from aviation incident reports. | ASN database | 1674 | English |
Authors, Year | NLP Model(s) | Reasoning Model(s) | Evaluation |
---|---|---|---|
Abedin et al., 2010 [35] | Weakly supervised lexicon learning with SVMs | Not Applicable | |
Shi et al., 2018 [13] | Latent semantic analysis with NB, VFDT, and OBA | Not Applicable | OBA yields the best performance in all four scenarios with mean accuracies of 76.5%, 76.8%, 77.0% (human factor classifier), and 88.3%, 87.0%, 88.45%, and 88.55% (aircraft classifier), respectively. |
Andrzejczak et al., 2014 [15] | IBM SPSS Modeler 13: Text Analytics | HFACS | This method reveals the relationship between human factors and reported anomalies. |
Ahadh et al., 2021 [36] | GuideLDA | Not Applicable | The weighted average accuracy is 77%. |
Zhang and Mahadevan, 2019 [12] | A hybrid SVM and DNN model | A risk-based event outcome categorization | The hybrid model yields better performance in precision, with an average score of 0.81, which is 3% higher than the SVM and 6% higher than DNN. |
Perboli et al., 2021 [37] | Word2vec and Doc2vec | SHEL | TFw2v_model has the best performance with a total precision of 88.89%. |
Jiao et al., 2022 [9] | TF-IDF, Word2vec, and OC-POS with LR, L-SVM, KNN, DT, NB, SVM, RF, AdaBoost, GBoost, and XGBoost | A rule-based system to identify the related factors | XGBoost classifier and OC-POS methods have the best performance, where F1-score is above 0.90 when identifying 25 causes from the target dataset. |
Robinson, 2019 [23] | LDA | Subject matter experts (SMEs) | All three SMEs were able to identify a cohesive theme from each topic. |
Tanguy et al., 2016 [20] | LDA with SVMs | Not Applicable | Result: 85.96% precision for ten iterations in the DGAC corpus and 46.49% in the ASRS corpus. |
Dong et al., 2021 [7] | Averaged Stochastic Gradient Descent Weight-Dropped (AWD) LSTM | Not Applicable | The proposed model yields an average accuracy of 82% on the six common factors and about 89% on the two most common factors. |
Kuhn, 2018 [11] | LDA with STM | Not Applicable | The results need to be verified by SMEs. |
Zhang et al., 2021 [14] | LSTM | Damage and injury level | The accident vs. incident model has an accuracy of 73% on validation data, while the sensitivity and specificity of the trained model are 75% and 72.14%, respectively. |
Andrzejczak et al., 2012 [38] | Diffusion Maps (DM) | Not Applicable | The proposed model yields an average accuracy of 82% on the six common factors and about 89% on the two most common factors on average. |
Miyamoto et al., 2022 [10] | BoW with TF-IDF | t-SNE and K-Means Clustering | The present work shows the ability to identify high-level causes and the circumstances in which delays occur. |
Robinson et al., 2015 [39] | LSA with SVD | Not Applicable | An unsupervised categorization accuracy of 44% for primary cause within the existing taxonomy based on a small sample. |
Irwin et al., 2017 [22] | LSA | Isometric Mapping and GIS | The present study confirms that the proposed approach is useful for reducing, interpreting, and organizing narrative data. |
Rose et al., 2022 [2] | LDA with STM | Not Applicable | This study demonstrates the feasibility of an STM-based approach for classifying aviation safety narratives. |
Koteeswaran et al., 2019 [18] | Improved oscillated correlation feature selection (IOCFS) with NB, SVM, ANN, k-NN, and J48 | Not Applicable | k-NN yields the best performance (accuracy of 99.03%), with the value of k = 5. |
Rose et al., 2020 [1] | BoW with TF-IDF | t-SNE and K-Means Clustering | The method identified 10 major clusters and 31 sub-clusters. |
Madeira et al., 2021 [19] | Word2Vec and Doc2Vec_Models with SVM and Bayesian optimization | HFACS | The best predictive models achieved a Micro F-score of 90%, 77.9%, and 87.5%. |
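Several studies in the table above represent narratives as bag-of-words vectors weighted by TF-IDF before clustering or classification [1,10]. A minimal sketch of that weighting follows, assuming whitespace tokenization and the unsmoothed variant tf(t, d) × log(N / df(t)). The reviewed studies may use different tokenization, smoothing, or normalization, and the example narratives are invented for illustration.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumes whitespace tokenization and the unsmoothed variant
    tf(t, d) * log(N / df(t)). Real pipelines typically add
    stop-word removal, stemming, and smoothing.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical narratives, invented for illustration only.
narratives = [
    "crew reported engine vibration during climb",
    "crew reported runway incursion during taxi",
    "engine fire warning during climb",
]
vectors = tf_idf(narratives)
```

A term that appears in every narrative (here "during") receives a weight of zero, while terms confined to a single narrative (such as "runway" or "fire") receive the highest weights. This is the property that lets clustering separate reports by their distinctive vocabulary.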
Authors, Year | Objectives | NLP Models | Data Sources | Sample Size | Language |
---|---|---|---|---|---|
Badrinath & Balakrishnan, 2022 [8] | ASR for ATC communication | | Transcripts of ATC communications from the U.S. and Europe | 84 h of audio transcription | English |
Zhang et al., 2022 [3] | Mandarin speech recognition for ATC | | The Aishell open-source Mandarin corpus and ATC voice recordings | 178 h of Aishell corpus and 67 h of ATC corpus | Chinese |
Lin et al., 2021 [41] | Multilingual speech recognition in ATC systems | | Raw ATC speech recorded at Chengdu, Shanghai, and Kunming Airports in China | 1148 h of Chinese speech and 281 h of English speech | Chinese, English |
Sun & Tang, 2021 [42] | Automated ATC communication error detection to prevent loss of separation (LoS) | | ATC communication from simulated approach control scenarios | 75 min simulation (234 clearances) | English |
Jia et al., 2017 [16] | Aviation radiotelephony readback verification | | Experimental civil aviation radiotelephony corpus built from original ATC communication recordings and books for training | 800 pairs of instruction and readback | Chinese |
Wang et al., 2021 [43] | Trajectory prediction | | 5000 Mandarin-based control instructions | N/A | Chinese |
Lin et al., 2019 [44] | ATC ASR and CIU-based method to convert speech into ATC-related elements | | Raw ATC speech from ZUUU in China | 578 h of ATC speech for model training (481 h Chinese and 97 h English) | Chinese, English |
Lin et al., 2020 [17] | Automatic speech recognition as a component of the ATC safety monitoring system | | ATC communication speech recorded at civil airports in China | 342 h of Chinese speech and 47 h of English speech | Chinese, English |
Vukovic et al., 2021 [45] | Cognitive load estimation from speech using spectral features | | Recorded speech from human–machine interaction experiment | 4.8 h of speech | English |
Tan et al., 2022 [46] | Speech emotion recognition for autonomous vehicles | | Interactive Emotional Dyadic Motion Capture (IEMOCAP) data set | N/A | English |
Authors, Year | ASR as a Primary Objective | Information Extraction | ASR Model(s) | Information Extraction Model(s) | Evaluation |
---|---|---|---|---|---|
Badrinath & Balakrishnan, 2022 [8] | × | Call sign and runway number | | | |
Zhang et al., 2022 [3] | × | | | | The proposed model’s character error rate (CER) was 11.1% on the expanded Aishell corpus and 8% on the ATC corpus. |
Lin et al., 2021 [41] | × | | | | A 3.95% label error rate (LER) on Chinese characters and English words. |
Sun & Tang, 2021 [42] | | Communication features and communication errors | | | No evaluation of ASR; study findings indicate a high correlation between read-back errors and LoS. |
Jia et al., 2017 [16] | | Semantic characteristics of ATC instructions and pilot readback | | | The proposed semantic consistency verification scheme with K-nearest neighbors (k-NN) and random forest (RF) as classifiers is more stable and accurate (83.8% and 83%). |
Wang et al., 2021 [43] | | Semantic characteristics of ATC instruction | | BiLSTM-LAN-CRF (a deep neural network-based algorithm) to extract the entities of ATC instruction | The percentage of wrong tags was used as the performance metric; BiLSTM-LAN-CRF yields the best result over the other three models. |
Lin et al., 2019 [44] | × | Controlling intent and parameters | | An RNN-based joint model for detecting the controlling intent and labeling the controlling parameters | A 4% WER with an average of 0.147 RTF was achieved. |
Lin et al., 2020 [17] | × | Repetition check, flight confirmation verification, and conflict detection | | | The proposed model decoding with the RNN-based language model yields the best result with a 5.07% and 5.99% WER for Chinese and English. |
Vukovic et al., 2021 [45] | | Cognitive load | | | The method yields 83.7% accuracy with CNN classifiers, which outperformed SVM and k-NN by 13.2% and 10.5%, respectively. |
Tan et al., 2022 [46] | | Speech emotion | | | The proposed method yields the best result over other methods, with a 74% weighted accuracy and 65.4% unweighted accuracy. |
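The ASR results in the table above are reported as word error rate (WER), character error rate (CER), and real-time factor (RTF). A minimal sketch of how these metrics are conventionally computed follows: WER and CER as the Levenshtein edit distance between the reference and the recognized sequence divided by the reference length, and RTF as processing time divided by audio duration. The function names and the example transcripts are hypothetical, and individual studies may normalize text differently before scoring.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (minimum number of substitutions, insertions, and deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER: word-level edit distance divided by reference word count."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference, hypothesis):
    """CER: character-level edit distance divided by reference length.
    Characters are compared as-is (no whitespace stripping), which is
    an assumption of this sketch."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def real_time_factor(processing_seconds, audio_seconds):
    """RTF: time spent decoding divided by the duration of the audio."""
    return processing_seconds / audio_seconds


# Invented ATC-style transcript pair: one deleted word, one substituted word.
reference = "climb and maintain flight level three five zero"
hypothesis = "climb maintain flight level three nine zero"
wer = word_error_rate(reference, hypothesis)  # 2 errors over 8 reference words
```

An RTF below 1 means the recognizer runs faster than real time, which is the relevant threshold for live ATC monitoring.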
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yang, C.; Huang, C. Natural Language Processing (NLP) in Aviation Safety: Systematic Review of Research and Outlook into the Future. Aerospace 2023, 10, 600. https://doi.org/10.3390/aerospace10070600