Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights Using the Aviation Safety Reporting System (ASRS)
Abstract
:1. Introduction
- Generation of succinct synopses of the incidents from incident narratives using ChatGPT.
- Comparison of the faithfulness of the generated synopses to human-written synopses.
- Identification of the human factors contributing to an incident.
- Identification of the entity involved in the incident.
- Providing explanatory logic/rationale for the generative language model’s decisions.
2. Background
2.1. Aviation Safety Reporting System (ASRS)
Form Name | Submitted by |
---|---|
General Report Form | Pilot, Dispatcher, Ground Ops, and Other |
ATC Report Form | Air Traffic Controller |
Maintenance Report Form | Repairman, Mechanic, and Inspector |
Cabin Report Form | Cabin Crew |
UAS Report Form | UAS Pilot, Visual Observer, and Crew |
2.2. Large Language Models (LLMs) as Foundation Models
2.2.1. Generative Language Models
- Supervised policy fine-tuning: Collect a set of instruction prompts and data labelers to demonstrate the desired output. This is used for supervised fine-tuning (SFT) of GPT-3.
- Training a reward model: Collect a set of instruction prompts, each with multiple different model outputs, and have data labelers rank the responses. This is used to train a reward model (RM) starting from the SFT model with the final layer removed.
- Optimizing a policy against the RM via RL: Collect a set of prompts, outputs, and corresponding rewards. This is used to fine-tune the SFT model on their environment using proximal policy optimization (PPO).
2.2.2. NLP in Aviation Safety Analysis
3. Materials and Methods
3.1. Dataset
3.2. Prompt Engineering for ASRS Analysis
3.3. Analyzing ChatGPT’s Performance
3.3.1. Similarity Analysis Using BERT-Based LM Embeddings
3.3.2. Manual Examination of Synopses
4. Results and Discussion
4.1. Generation of Incident Synopses
4.2. Performance with Human Factors-Related Issues
4.3. Assessment of Responsibility
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
AI | Artificial Intelligence |
ATC | Air Traffic Control |
ASRS | Aviation Safety Reporting System |
BERT | Bidirectional Encoder Representations from Transformers |
CSV | comma-separated values |
FAA | Federal Aviation Administration |
GPT | Generative Pre-trained Transformer |
JSON | JavaScript Object Notation |
LaMDA | Language Models for Dialog Applications |
LLaMA | Large Language Model Meta AI |
LLM | Large Language Model |
MLM | Masked Language Modeling |
MVA | Minimum Vectoring Altitude |
NAS | National Airspace System |
NASA | National Aeronautics and Space Administration |
NLP | Natural Language Processing |
NSP | Next-Sentence Prediction |
PaLM | Pathways Language Model |
PPO | Proximal Policy Optimization |
RL | reinforcement learning |
RLHF | reinforcement learning from human feedback |
RM | Reward Model |
SFT | Supervised Fine-Tuning |
UAS | Unmanned Aerial Systems |
Appendix A
Listing A1. Prompt used for this work. |
References
- ASRS Program Briefing PDF. Available online: https://asrs.arc.nasa.gov/docs/ASRS_ProgramBriefing.pdf (accessed on 16 May 2023).
- ASRS Program Briefing. Available online: https://asrs.arc.nasa.gov/overview/summary.html (accessed on 16 May 2023).
- Boesser, C.T. Comparing Human and Machine Learning Classification of Human Factors in Incident Reports from Aviation. 2020. Available online: https://stars.library.ucf.edu/cgi/viewcontent.cgi?article=1330&context=etd2020 (accessed on 16 May 2023).
- Andrade, S.R.; Walsh, H.S. SafeAeroBERT: Towards a Safety-Informed Aerospace-Specific Language Model. In AIAA AVIATION 2023 Forum; American Institute of Aeronautics and Astronautics (AIAA): San Diego, CA, USA, 2023. [Google Scholar] [CrossRef]
- Ouyang, L.; Wu, J.; Jiang, X.; Almeida, D.; Wainwright, C.; Mishkin, P.; Zhang, C.; Agarwal, S.; Slama, K.; Ray, A.; et al. Training language models to follow instructions with human feedback. Adv. Neural Inf. Process. Syst. 2022, 35, 27730–27744. [Google Scholar]
- Tikayat Ray, A.; Bhat, A.P.; White, R.T.; Nguyen, V.M.; Pinon Fischer, O.J.; Mavris, D.N. ASRS-ChatGPT Dataset. Available online: https://huggingface.co/datasets/archanatikayatray/ASRS-ChatGPT (accessed on 16 May 2023).
- Electronic Report Submission (ERS). Available online: https://asrs.arc.nasa.gov/report/electronic.html (accessed on 16 May 2023).
- General Form. Available online: https://akama.arc.nasa.gov/asrs_ers/general.html (accessed on 16 May 2023).
- Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186. [Google Scholar] [CrossRef]
- Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. 2018. Available online: https://www.mikecaptain.com/resources/pdf/GPT-1.pdf (accessed on 16 May 2023).
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- OpenAI. GPT-4 Technical Report. arXiv 2023, arXiv:2303.08774. [Google Scholar]
- Touvron, H.; Lavril, T.; Izacard, G.; Martinet, X.; Lachaux, M.A.; Lacroix, T.; Rozière, B.; Goyal, N.; Hambro, E.; Azhar, F.; et al. Llama: Open and efficient foundation language models. arXiv 2023, arXiv:2302.13971. [Google Scholar]
- Touvron, H.; Martin, L.; Stone, K.; Albert, P.; Almahairi, A.; Babaei, Y.; Bashlykov, N.; Batra, S.; Bhargava, P.; Bhosale, S.; et al. Llama 2: Open Foundation and Fine-Tuned Chat Models. arXiv 2023, arXiv:2307.09288. [Google Scholar]
- Thoppilan, R.; De Freitas, D.; Hall, J.; Shazeer, N.; Kulshreshtha, A.; Cheng, H.T.; Jin, A.; Bos, T.; Baker, L.; Du, Y.; et al. Lamda: Language models for dialog applications. arXiv 2022, arXiv:2201.08239. [Google Scholar]
- Chowdhery, A.; Narang, S.; Devlin, J.; Bosma, M.; Mishra, G.; Roberts, A.; Barham, P.; Chung, H.W.; Sutton, C.; Gehrmann, S.; et al. Palm: Scaling language modeling with pathways. arXiv 2022, arXiv:2204.02311. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.u.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017. [Google Scholar]
- Hinton, G.E.; Osindero, S.; Teh, Y.W. A fast learning algorithm for deep belief nets. Neural Comput. 2006, 18, 1527–1554. [Google Scholar] [CrossRef]
- Tikayat Ray, A.; Pinon Fischer, O.J.; Mavris, D.N.; White, R.T.; Cole, B.F. aeroBERT-NER: Named-Entity Recognition for Aerospace Requirements Engineering using BERT. In AIAA SCITECH 2023 Forum; American Institute of Aeronautics and Astronautics (AIAA): National Harbor, MD, USA, 2023. [Google Scholar] [CrossRef]
- Bommasani, R.; Hudson, D.A.; Adeli, E.; Altman, R.; Arora, S.; von Arx, S.; Bernstein, M.S.; Bohg, J.; Bosselut, A.; Brunskill, E.; et al. On the Opportunities and Risks of Foundation Models. arXiv 2021, arXiv:2108.07258. [Google Scholar]
- Tikayat Ray, A.; Cole, B.F.; Pinon Fischer, O.J.; White, R.T.; Mavris, D.N. aeroBERT-Classifier: Classification of Aerospace Requirements Using BERT. Aerospace 2023, 10, 279. [Google Scholar] [CrossRef]
- Tikayat Ray, A.; Cole, B.F.; Pinon Fischer, O.J.; Bhat, A.P.; White, R.T.; Mavris, D.N. Agile Methodology for the Standardization of Engineering Requirements Using Large Language Models. Systems 2023, 11, 352. [Google Scholar] [CrossRef]
- Tikayat Ray, A. Standardization of Engineering Requirements Using Large Language Models. Ph.D. Thesis, Georgia Institute of Technology, Atlanta, GA, USA, 2023. [Google Scholar] [CrossRef]
- Weaver, W. Translation. In Machine Translation of Languages; Locke, W.N., Boothe, A.D., Eds.; MIT Press: Cambridge, MA, USA, 1952; pp. 15–23. Available online: https://aclanthology.org/1952.earlymt-1.1.pdf (accessed on 16 May 2023).
- Brown, P.F.; Cocke, J.; Della Pietra, S.A.; Della Pietra, V.J.; Jelinek, F.; Lafferty, J.; Mercer, R.L.; Roossin, P.S. A statistical approach to machine translation. Comput. Linguist. 1990, 16, 79–85. [Google Scholar]
- Bengio, Y.; Ducharme, R.; Vincent, P. A Neural Probabilistic Language Model. In Advances in Neural Information Processing Systems; Leen, T., Dietterich, T., Tresp, V., Eds.; MIT Press: Cambridge, MA, USA, 2000; Volume 13. [Google Scholar]
- Gehman, S.; Gururangan, S.; Sap, M.; Choi, Y.; Smith, N.A. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models. arXiv 2020, arXiv:2009.11462. [Google Scholar]
- Ziegler, D.M.; Stiennon, N.; Wu, J.; Brown, T.B.; Radford, A.; Amodei, D.; Christiano, P.; Irving, G. Fine-tuning language models from human preferences. arXiv 2019, arXiv:1909.08593. [Google Scholar]
- Stiennon, N.; Ouyang, L.; Wu, J.; Ziegler, D.; Lowe, R.; Voss, C.; Radford, A.; Amodei, D.; Christiano, P.F. Learning to summarize with human feedback. Adv. Neural Inf. Process. Syst. 2020, 33, 3008–3021. [Google Scholar]
- Graeber, C. The role of human factors in improving aviation safety. Aero Boeing 1999, 8. [Google Scholar]
- Santos, L.; Melicio, R. Stress, Pressure and Fatigue on Aircraft Maintenance Personal. Int. Rev. Aerosp. Eng. 2019, 12, 35–45. [Google Scholar] [CrossRef]
- Saleh, J.H.; Tikayat Ray, A.; Zhang, K.S.; Churchwell, J.S. Maintenance and inspection as risk factors in helicopter accidents: Analysis and recommendations. PLoS ONE 2019, 14, e0211424. Available online: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0211424 (accessed on 16 May 2023). [CrossRef] [PubMed]
- Dumitru, I.M.; Boşcoianu, M. Human factors contribution to aviation safety. Int. Sci. Comm. 2015, 49. Available online: https://www.afahc.ro/ro/afases/2015/afases_2015/air_force/dumitru_%20boscoianu.pdf (accessed on 16 May 2023).
- Hobbs, A. Human factors: The last frontier of aviation safety? Int. J. Aviat. Psychol. 2004, 14, 331–345. [Google Scholar] [CrossRef]
- Salas, E.; Maurino, D.; Curtis, M. Human factors in aviation: An overview. Hum. Factors Aviat. 2010, 3–19. [Google Scholar] [CrossRef]
- Cardosi, K.; Lennertz, T. Human Factors Considerations for the Integration of Unmanned Aerial Vehicles in the National Airspace System: An Analysis of Reports Submitted to the Aviation Safety Reporting System (ASRS). 2017. Available online: https://rosap.ntl.bts.gov/view/dot/12500 (accessed on 16 May 2023).
- Madeira, T.; Melício, R.; Valério, D.; Santos, L. Machine learning and natural language processing for prediction of human factors in aviation incident reports. Aerospace 2021, 8, 47. [Google Scholar] [CrossRef]
- Aurino, D.E.M. Human factors and aviation safety: What the industry has, what the industry needs. Ergonomics 2000, 43, 952–959. [Google Scholar] [CrossRef] [PubMed]
- Hobbs, A. An overview of human factors in aviation maintenance. ATSB Safty Rep. Aviat. Res. Anal. Rep. AR 2008, 55, 2008. [Google Scholar]
- Kierszbaum, S.; Klein, T.; Lapasset, L. ASRS-CMFS vs. RoBERTa: Comparing Two Pre-Trained Language Models to Predict Anomalies in Aviation Occurrence Reports with a Low Volume of In-Domain Data Available. Aerospace 2022, 9, 591. [Google Scholar] [CrossRef]
- Yang, C.; Huang, C. Natural Language Processing (NLP) in Aviation Safety: Systematic Review of Research and Outlook into the Future. Aerospace 2023, 10, 600. [Google Scholar] [CrossRef]
- Tanguy, L.; Tulechki, N.; Urieli, A.; Hermann, E.; Raynal, C. Natural language processing for aviation safety reports: From classification to interactive analysis. Comput. Ind. 2016, 78, 80–95. [Google Scholar] [CrossRef]
- OpenAI. ChatGPT API; gpt-3.5-turbo. 2023. Available online: https://openai.com/blog/introducing-chatgpt-and-whisper-apis (accessed on 4 June 2023).
- Reimers, N.; Gurevych, I. Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), Hong Kong, China, 3–7 November 2019; Association for Computational Linguistics: Hong Kong, China, 2019; pp. 3982–3992. [Google Scholar] [CrossRef]
- Heydarian, M.; Doyle, T.E.; Samavi, R. MLCM: Multi-Label Confusion Matrix. IEEE Access 2022, 10, 19083–19095. [Google Scholar] [CrossRef]
Column Name | Description |
---|---|
ASRS Record Number (ACN) | Unique identifier for each record in the ASRS database; |
Example: 881998, 881724, etc. | |
Date | The date on which the incident occurred is provided in a yyyymm format. This is done to de-identify incidents by removing “Day” information; |
Example: 201004, 201610, etc. | |
Local Time of Day | The incident time is categorized into specific time buckets to maintain anonymity and prevent the inclusion of exact incident times. These time buckets divide the 24-h period into four intervals; |
Example: 0001-0600, 0601-1200, 1201-1800, and 1801-2400 | |
Human Factors | Human Factors in aviation refers to the discipline that examines the impact of human performance, cognition, and behavior on aviation incidents, with the aim of understanding and mitigating factors, such as human error, fatigue, communication breakdowns, and inadequate training, that contribute to accidents or near misses in the aviation industry; |
Example: Communication Breakdown, Confusion, Distraction, Fatigue, Human–Machine Interface, Situational Awareness, Time Pressure, Workload, etc. | |
Contributing Factors/Situations | The factors or circumstances that played a role in the incident’s occurrence, as identified by the reporter (in the narrative) and/or safety analyst; |
Example: Human Factors, Environment Non-Weather-Related, Procedure, and Airspace Structure. Each incident can have multiple contributing factors. | |
Primary Problem | The main cause that led to the incident as identified by the safety analyst; |
Example: Human Factors, Environment Non-Weather-Related, Procedure, and Airspace Structure. However, each incident can have only one primary problem that led to the incident. | |
Narrative | The description of the incident provided by the reporter includes information about the chain of events, “how the problem arose”, and various human performance considerations, such as perceptions, judgments, decisions, and factors affecting the quality of human performance, actions, or inactions; |
Example: A C680, checked on to frequency (very thick accent). I verified his Mode C and verified his assigned altitude of 11,000. I issued a 070 heading out of PVD VOR to intercept the Runway 4R localizer. He said ‘roger, zero seven zero’. Moments later I noticed his altitude out of 10,000. I asked for an altitude verification and issued a climb. Then I pointed the aircraft out to the adjacent facilities who responded that there was no problem and point out approved. Continued with routine handling. Just a language barrier. Just a foreign pilot and language, although we use English as a common language in ATC, can be a barrier. | |
Synopsis | The summary of the incident written by safety analysts; |
Example: A90 Controller described a pilot error event when the flight crew of a foreign-registered aircraft descended below the assigned altitude during vectors to final. |
Human Factors Issue | Definition |
---|---|
Communication Breakdown | Failure in the exchange of information or understanding between pilots, air traffic controllers, or other personnel, leading to potential errors or safety issues in flight operations |
Confusion | State where pilots, air traffic controllers, or other personnel are uncertain or lack clarity about flight information or procedures, potentially compromising flight safety or efficiency |
Distraction | Any event, process, or activity that diverts attention away from a pilot’s primary task of safely controlling the aircraft or prevents air traffic controllers from effectively managing flight operations |
Fatigue | State of mental or physical exhaustion that reduces a pilot’s ability to safely operate an aircraft or perform flight-related duties |
Human–Machine Interface | Problems or difficulties in the interaction between pilots (or other personnel) and aviation equipment or systems, which can hinder operations and potentially compromise flight safety |
Physiological—Other | Can include conditions like fatigue, hypoxia, barotrauma, dehydration, deep vein thrombosis, jet lag, spatial disorientation, effects of G-force, chronic noise and vibration exposure, radiation exposure, and disruptions to circadian rhythms, each resulting from the unique environmental and physical challenges of flight |
Situational Awareness | Refers to a scenario where a pilot or crew has an incomplete, inaccurate, or misinterpreted understanding of their flight environment, which can potentially lead to operational errors or accidents |
Time Pressure | Urgency or stress that pilots or air traffic controllers may experience when they have limited time to make crucial decisions or complete necessary tasks, often impacting safety and operational efficiency |
Training/Qualification | Problems or challenges arising due to insufficient, inadequate, or improper training and certification of aviation personnel, including pilots, air traffic controllers, and maintenance crews, potentially impacting the safety and efficiency of aviation operations |
Troubleshooting | Process of identifying and solving mechanical, technical, operational, or human factors-related problems that occur in the functioning of aircraft or in aviation operations, to maintain safety and efficiency |
Workload | Tasks or responsibilities assigned to aviation personnel, such as pilots, air traffic controllers, or maintenance crews, exceed their capacity, potentially resulting in fatigue, errors, and safety risks |
Other/Unknown | Problems, errors, or challenges occurring within aviation operations that cannot be readily categorized or identified under established categories and might require further investigation |
Generated Feature | Description |
---|---|
Synopsis | A synopsis of the narrative in 1–2 sentences that includes important details, such as the name of the system, and other relevant abbreviations, as necessary. |
Human Factors Issue | A list of human factors issues predicted from the narrative, from the categories Communication Breakdown, Confusion, Distraction, Fatigue, Human–Machine Interface, Other/Unknown, Physiological—Other, Situational Awareness, Time Pressure, Training/Qualification, Troubleshooting, and Workload (mirroring the issues used in the ASRS), along with additional issues that ChatGPT was free to suggest. |
Human Factors Issue (Rationale) | A 1–2 sentence description of the rationale ChatGPT used to decide which human factors issues were relevant. |
Incident Attribution | An entity/entities to whom the incident can be attributed based on the narrative. |
Incident Attribution (Rationale) | A description of the rationale ChatGPT used to attribute the incident to the specified party. |
ACN | Synopsis (Safety Analyst) | Synopsis (ChatGPT) | CS |
---|---|---|---|
BERT Embedding | |||
963743 | As a J3 slowed after landing the pilot over corrected a left drift so the aircraft veered right off the runway where the left wheel hit a prairie dog hole and broke the axle. | During landing at the home airport, the Piper J-3 Cub veered left and the pilot over-corrected, causing the aircraft to turn hard right and run off the right side of the runway. The left tire hit a prairie dog hole and bent the left axle. | 0.963 |
1268182 | A B-747 flight crew attempted to do a 180-degree turn on a runway due to taxiway closures. The Captain began the turn with about seven knots but was unable to complete the turn with that initial ground speed. The nose tire partially departed the runway and cables were used to get the Aircraft back on to the Runway. The aircraft was found to have no damage and cleared for flight. | The crew of Aircraft X attempted to perform a 180-degree turn on Runway 08, which was narrower than the minimum width required for the turn. The crew’s decision to continue with the turn was influenced by the urgency of transporting high-priority cargo, including blood supplies. The aircraft skidded twice during the turn, and the crew decided to stop the aircraft and request tug assistance to complete the turn. The aircraft was eventually pulled back onto the runway and cleared for flight after inspection. | 0.962 |
1230761 | A Line Aircraft Maintenance Technician (AMT) reports finding the inboard end of a left aileron contacting a wing rib directly inboard of the aileron on a Cessna CE-560EP aircraft. Rivets and screws retaining the balance weight cover were gouging into the rib. Aileron had been installed three years earlier, but was found to be incorrect for the aircraft. | During a preliminary inspection of a Cessna CE560EP aircraft, the inboard end of the L/H aileron was found to be contacting the wing rib directly inboard of the aileron. Maintenance Control initially suggested grinding the aileron to get clearance, but a later E/O was issued to grind the rivet down and fly the aircraft under Ferry permit to get repaired properly. Three days later, it was discovered that the aileron was the wrong part number installed three years prior. | 0.959 |
aeroBERT Embedding | |||
1759478 | Air carrier Captain reported a passenger appeared intoxicated during boarding, but was allowed to fly. During the flight the passenger was non-compliant with face mask policy. | During the flight, a passenger appeared to be intoxicated and was non-compliant with mask requirements. The flight crew notified customer service and had the passenger removed upon arrival at the gate. | 0.987 |
1112916 | An experienced pilot was distracted by conversation with a passenger and neglected to remove the tow bar during pre-flight. This was discovered when attempting to reposition into the hangar after the flight. The tow bar was found off the runway with no damage to the aircraft. | The pilot was distracted during pre-flight and failed to remove the aircraft tow bar. The tow bar was found 1800 FT from the departure end of the runway after the flight. No apparent damage was found on the aircraft. | 0.987 |
1158337 | After landing in MIA, a fatigued flight crew realized that they were on Approach Control frequency and had not received landing clearance. | The flight crew forgot to switch to the Tower frequency and landed without clearance due to a lack of situational awareness and fatigue. | 0.985 |
SBERT Embedding | |||
1011027 | A pilot reported loosing control of a Maule Amphibian on takeoff after he lifted off to soon in a crosswind causing a wing to stall and forward parts of the floats to contact the runway. | The pilot lost control of a Maule Amphibian during takeoff in strong crosswind conditions, causing the aircraft to contact the ground with the forward parts of the floats. | 0.924 |
917365 | A Line Mechanic was informed that a lock-pin from the aft service door evacuation slide inflation valve had not been removed after he replaced a MD-88 door slide. The lock-pin was allegedly found six months later during a cabin maintenance visit. | During a cabin maintenance visit, it was found that the lock pin from the aft service door evacuation slide inflation valve had not been removed after the replacement of evacuation slides in a MD-88 aircraft. | 0.921 |
940308 | A Line Mechanic loosely installed mount bolts during replacement of a Center Tank Fuel Quantity Transient Suppression Device (TSD) on a B737-300 aircraft to allow another Mechanic to connect a wire harness to the TSD. Mechanic realized after the aircraft departed that the mount bolts may not have been tightened. | During maintenance work on a B737-300 aircraft, a lack of communication between two Aviation Maintenance Technicians (AMTs) led to the temporary installation of the Center Fuel Quantity Transient Suppression Device (TSD) box with mount bolts finger tight. The AMT who installed the TSD harness was not informed about this, leading to loose bolts. The issue was later corrected by another AMT during a Routine Overnight (RON) inspection. | 0.920 |
ACN | Synopsis (Safety Analyst) | Synopsis (ChatGPT) | CS |
---|---|---|---|
BERT Embedding | |||
837587 | A light twin and a single engine have a NMAC at Isla Grande. | The pilot was unable to communicate effectively with the tower due to a congested frequency, resulting in a near mid-air collision with another aircraft on converging flight paths. | 0.559 |
1889454 | Air carrier ground personnel reported Hazmat cargo was transported on wrong aircraft. | Hazmat intended for a later mainline trip was mistakenly loaded onto an express trip. | 0.621 |
1231062 | CE525 Captain expresses confusion over RNAV SID’s STAR’s after modification by ATC, in general and the UECKR1 specifically. | The flight crew received conflicting instructions from ATC during clearance, which led to confusion and misinterpretation of the clearance. | 0.623 |
aeroBERT Embedding | |||
1671165 | C172 Pilot reported a NMAC while training. | During air work, the pilot was distracted by maintaining airspeed and bank angle while performing an emergency descent, leading to a lack of situational awareness. The pilot did not take sufficient steps to confirm the other aircraft’s position, and the two aircraft passed each other with a relatively small clearance. | 0.890 |
1878408 | Small Aircraft Instructor Pilot reported a NMAC. | On DATE at XA:20, a flight crew and their student were flying in the Northeast Practice Area in Arizona when they encountered another aircraft, Aircraft Y, that was not following the right of way standard right turn to avoid traffic. The incident was caused by a communication breakdown between the flight crew and the other aircraft on the NE Practice area frequency. | 0.882 |
1638197 | C310 pilot reported flying VFR into IMC. | The pilot cancelled IFR based on the belief that VMC prevailed in the area, which was not the case. The pilot had an unreasonable belief that the weather would improve and did not consider requesting immediate IFR clearance or performing a 180 turn. The incident was caused by the pilot’s decision-making and actions. | 0.894 |
SBERT Embedding | |||
837587 | A light twin and a single engine have a NMAC at Isla Grande. | The pilot was unable to communicate effectively with the tower due to a congested frequency, resulting in a near mid-air collision with another aircraft on converging flight paths. | 0.123 |
1409509 | C172 pilot reported an NMAC in the vicinity of IGX airport. | During a VFR flight utilizing flight following, the pilot failed to set the ascent rate on the autopilot, causing the aircraft to hold the current altitude instead of climbing to the requested altitude of 4500 feet. This led to a potential conflict with another aircraft, and the pilot had to descend further to maintain separation. | 0.162 |
1028402 | EMB145 Captain describes the factors that resulted in missing a crossing restriction during the GIBBZ1 RNAV arrival to IAD. | The flight crew deviated from a new arrival procedure due to the First Officer’s inexperience in the aircraft and uncertainty about an acceptable rate of descent. The Captain’s focus on the next waypoint ahead of the current one led to a delay in realizing the aircraft was too high. | 0.218 |
ACN | Incident Narrative | Label (Analyst) | Label (ChatGPT) |
---|---|---|---|
1758247 | After reporting at XA:15 for a XB:45 pushback to ZZZZ, we were delayed for a maintenance issue. Our CCO time was XD:28, we waived and agreed to extend for 2 h until XF:28. The time ran out and we were delayed because of legality issues. It took us 1 h to get to the hotel, because of late night and COVID-19 procedures. I entered my room at XG:30. I tried to find some food because the intention was to complete the 15 h flight to ZZZZ. I found no food so elected to try to rest. I rolled around trying to decompress and unwind from the on again off again rolling delay of the maintenance issue. I finally fell asleep around XJ:00. At XK:00, the hotel started to come alive. There were numerous door openings and closings. They were obviously crew members leaving for work. Not 45 min after numerous crew members departed for work housekeeping arrived to clean and disinfect the rooms. It seemed like due to COVID-19 the process took longer and was much louder than it should have been. By that time it was XN:00 and I had slept for maybe 60 min. I decided I was not fit to fly due to operational caused fatigue and informed the crew desk at XO:00. | Physiological—Other | Fatigue |
1568978 | (It was a) normal pattern to a base leg. After deploying flaps, the boost pump was accidentally switched on, which flooded the engine and caused a total loss of engine power. Due to the low altitude and the flaps being fully deployed, the aircraft could not glide to the runway, so I made the decision to take control and land off to the left side just short of the runway. The aircraft stopped rolling approximately 25 feet before the runway threshold. There appears to be no damage to the area where the landing occurred or to the aircraft itself. There is no sign of mechanical failure, only pilot error. | Other/Unknown | Distraction |
1021950 | July 2012. On Aircraft X, an ERJ-190 aircraft, I removed and replaced the Captain’s Pitch-Trim switch in accordance with Aircraft Maintenance Manual (AMM). Did a Return to Service and Operational Test per AMM. Operational Check good, MEL was cleared. Two days later, Pilot in ZZZ1 reported Captain’s yoke Elevator Trim (Pitch Trim) switch operates opposite to input. In ZZZ1, Captain’s [Pitch] Trim switch was re-installed per AMM and Operational Check good, OK for service. | Communication Breakdown; Confusion; Situational Awareness; Troubleshooting | Maintenance |
1013382 | I was informed by operations that the aircraft that I flew had the wrong database installed in the FMS. It had a B777 database instead of an MD11 database. I did check the date of the database but did not check the numbers at the top to verify correct installation. After seeing the same thing at the top of the screen I rarely verify the correct database installed! We proceeded to destination without any issues or ATC questions so I am assuming the data base was close enough to operate a MD11. Have Maintenance verify the aircraft type and database type before installation then reverify after load. | Human–Machine Interface; Situational Awareness | Training/Qualification |
834159 | When I was doing my safety check I noticed that the window on door 2 L was covered in condensation. The mechanic came on removed the interior window and wiped off the condensation. He replaced the interior window stated it would be deferred and was ‘good’ for 100 h. Upon landing at our destination the window was again 100 percent blocked. 1R was 100 percent blocked and 2R was about 60 percent blocked. The plane continued to be released despite the serious safety concerns of the flight attendants. We went on to our next destination and again upon landing 2 L window was 100 percent blocked. | Troubleshooting | Maintenance |
Class | Precision | Recall | F1 Score | Support |
---|---|---|---|---|
Communication Breakdown | 0.67 | 0.62 | 0.64 | 4332 |
Confusion | 0.67 | 0.05 | 0.1 | 2570 |
Distraction | 0.53 | 0.38 | 0.44 | 2072 |
Fatigue | 0.71 | 0.69 | 0.7 | 481 |
Human–Machine Interface | 0.44 | 0.04 | 0.08 | 1210 |
Other/Unknown | 0.19 | 0.03 | 0.05 | 609 |
Physiological—Other | 0.42 | 0.25 | 0.32 | 208 |
Situational Awareness | 0.74 | 0.52 | 0.61 | 6475 |
Time Pressure | 0.59 | 0.15 | 0.24 | 1132 |
Training/Qualification | 0.32 | 0.47 | 0.38 | 1649 |
Troubleshooting | 0.45 | 0.07 | 0.12 | 455 |
Workload | 0.48 | 0.16 | 0.24 | 1305 |
Weighted Average | 0.61 | 0.38 | 0.43 | 22,498 |
Entity Associated with Incident | Count |
---|---|
Flight Crew | 5744 |
ATC | 1838 |
Ground Personnel | 805 |
Maintenance | 738 |
Flight Crew, ATC | 133 |
ACN | Incident Narrative | Incident Attribution (ChatGPT) | Rationale (ChatGPT) |
---|---|---|---|
1805938 | The lead Flight Attendant was so uptight with a passenger not complying with the Mask policies, (Flight Attendant) thought it was prudent to chime the cockpit during landing rollout and advise the flight crew to notify a Supervisor to meet the aircraft. This was a direct violation of sterile cockpit procedures. Because flight crews repeatedly experience chimes from the cabin while in simulator training, they are most associated with fire or critical aircraft issues. This is never a good time to chime the cockpit unless it is an emergency. Especially for a mask compliance issue. Relaxed emphasis from Leadership on the tattling on passengers for improper use of masks (never saw this type of behavior from FA when it came to early seat-belt releases. Reissued sterile cockpit etiquette with a bulletin on scenario that are and are not acceptable for chiming the cockpit during landing rollout. | Flight Crew | The flight attendant’s actions caused a distraction to the flight crew during landing rollout. |
874307 | I was working Aircraft X inbound to ADS. Along with this aircraft I had several other VFR pop up aircraft that had to be low level to remain VFR, typically around 1500 feet. Aircraft X was at 2000 on a vector for an ILS approach approximately 15 miles southeast of ADS. I observed the aircraft in proximity to a 2200 foot MVA. I, for some reason, registered that the aircraft was VFR and instructed the aircraft to maintain VFR at or below 2500 so that he could maintain obstacle clearance. The pilot complied and descended to 1600 to VFR conditions. I pointed out the antenna and the pilot reported it in sight. It was then that I realized that the aircraft was in fact IFR and I climbed him immediately to 3000. By that time the aircraft was inside the MVA at 1600. Recommendation, this event could have been avoided had I had a better scan. It was complicated with bad weather and having to watch aircraft that were deviating and descending/climbing in areas where they would not normally be being worked by other positions. | ATC | The incident was caused by the controller’s error in providing incorrect instructions to the pilot. |
868384 | Aircraft was presented to Crew in ZZZ with a deferred item open, stating the forward coffee maker and spigot were inop’ed at a previous Station and the deferred MEL included draining the A320 of potable water and deferring the entire water system. While reviewing the Maintenance Release, the First Officer reported the forward lavatory toilet flushed normally. Water (H20) quantity was checked and it was found to be full, in direct contrast to the MEL instructions for the deferral on the Maintenance Release. Maintenance Control was contacted and a Maintenance Report item sent. Contract Maintenance called to aircraft in ZZZ and he reinstated the water system as no faults could be located. A few other write-ups were handled by ZZZ Maintenance Technician and he left the aircraft, with the Deferral placard still located on the forward Cabin Intercommunication Data System (CIDS) panel. This item was not discovered until en-route to ZZZ1. Aircraft was not serviced with potable water in ZZZ, so it operated at least one leg in violation of the MEL. The ZZZ Maintenance Technician stated the aircraft appeared to be not configured correctly for the ‘No’ potable water operation as all the valves had been left open. If the aircraft had a frozen water system as originally expected in the first write-up, how can one drain a water system that is frozen? Rhetorical question but that was the procedure listed under the MEL. | Maintenance | The incident was caused by a maintenance error in deferring the water system and not properly configuring the aircraft for ‘No’ potable water operation. |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content. |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Tikayat Ray, A.; Bhat, A.P.; White, R.T.; Nguyen, V.M.; Pinon Fischer, O.J.; Mavris, D.N. Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights Using the Aviation Safety Reporting System (ASRS). Aerospace 2023, 10, 770. https://doi.org/10.3390/aerospace10090770
Tikayat Ray A, Bhat AP, White RT, Nguyen VM, Pinon Fischer OJ, Mavris DN. Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights Using the Aviation Safety Reporting System (ASRS). Aerospace. 2023; 10(9):770. https://doi.org/10.3390/aerospace10090770
Chicago/Turabian StyleTikayat Ray, Archana, Anirudh Prabhakara Bhat, Ryan T. White, Van Minh Nguyen, Olivia J. Pinon Fischer, and Dimitri N. Mavris. 2023. "Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights Using the Aviation Safety Reporting System (ASRS)" Aerospace 10, no. 9: 770. https://doi.org/10.3390/aerospace10090770
APA StyleTikayat Ray, A., Bhat, A. P., White, R. T., Nguyen, V. M., Pinon Fischer, O. J., & Mavris, D. N. (2023). Examining the Potential of Generative Language Models for Aviation Safety Analysis: Case Study and Insights Using the Aviation Safety Reporting System (ASRS). Aerospace, 10(9), 770. https://doi.org/10.3390/aerospace10090770