1. Introduction
Wind shear, a complex weather phenomenon characterized by sudden horizontal or vertical changes in wind speed, seriously threatens flight safety. When an aircraft encounters wind shear, its airspeed, climb rate, angle of attack, lift coefficient, and other parameters can change suddenly, which can seriously affect maneuvering and altitude control and, if not properly handled, can result in an aviation accident. Low-level wind shear in particular, which can cause drastic horizontal or vertical changes in the wind vector below 600 m [1, 2], is a significant risk factor during takeoff and approach. It usually arises from severe convection, frontal systems, radiative inversions, and special airspace, all of which can result in a drastic loss of aircraft energy [3, 4]. Therefore, if the crew does not respond effectively and promptly, accidents may occur. These low-level wind shear hazards have been defined by the International Civil Aviation Organization (ICAO) as unsafe events that require additional monitoring. The International Air Transport Association (IATA) reported that 31% of all accidents in 2022 were the result of adverse weather conditions, with the most often cited contributing factors (18% of the accidents) being wind shear and thunderstorms [5]. In China, the flight quality-monitoring big data platform reported that 21.31% of the red events in 2022 were wind shear alarm events. Recent typical unsafe wind shear-related events are shown in
Table 1. Although flight crews are given meteorological information in advance, and onboard forward-looking radars (predictive wind shear systems, or PWSs), automatic terminal information services (ATISs), and flight augmentation computers (FACs) can provide some level of detection and warning [6], human factors can still have a significant impact on the occurrence of unsafe events. Faced with unexpected aviation meteorological threats, such as wind shear events, flight crews need to manage complex information effectively and make timely decisions that test their knowledge, skills, attitudes, behavioral performance, and threat and error management abilities [7].
Based on the aircraft’s movement relative to the wind vector, aviation meteorology divides wind shear events into four types: downwind, upwind, lateral (side), and vertical [
8]. Each of these requires different pilot operations. When downwind shear is encountered, there is a sudden decrease in airspeed, which causes the airplane to lose altitude due to the reduction in lift. When encountered during takeoff, the roll is characterized by a slow increase in speed and an inability to leave the ground. If encountered at a high altitude, the airplane loses altitude after passing through the shear line, but the crew has a chance to recover. However, if the plane is at a low altitude or action is delayed, the plane may crash or be forced to touch down outside the runway. When upwind shear is encountered, the airplane may depart from its normal climb or descent path because of sudden increases in airspeed and lift, especially if it is passing through an upwind wind shear area during the approach phase, at which point the crew must deal with the airplane’s higher altitude or speed. Lateral wind shear can cause the airplane to slip, undergo a slope change, and deviate from its original intended trajectory, which can cause it to land on a slope, land with yaw, or even run off the runway during landing. Depending on the landing distance available (LDA), a strong tailwind can adversely affect the landing; therefore, if this risk is not quickly dealt with, the landing could be dangerous. Finally, the most dangerous of all wind shear types is vertical wind shear. When an airplane is subjected to a sudden and strong downdraft, it can suddenly fall abnormally and deviate from its original trajectory. In general, because unexpected meteorological conditions, such as wind shear, can threaten flight safety, knowing how to handle and manage such threats is vital to ensuring flight safety. Based on many incident investigations and line operation safety assessment (LOSA) data, the ICAO developed a pilot competency framework to ensure that pilots have the skills to effectively respond to aviation threats. 
The competencies include the application of knowledge, procedural operations, regulatory compliance, automated flight path management, and manual control flight path management, as well as non-technical competencies, such as communication, leadership and teamwork, problem solving and decision making, situation awareness, and workload management [
9]. The non-technical competencies are mainly Crew Resource Management (CRM) capacities. In fact, a pilot’s ability to handle complex weather threats such as wind shear depends more on the performance of specific CRM behaviors.
Consequently, the effective assessment of a pilot’s abilities to handle these meteorological threats in flight has attracted widespread attention and is also the core of the data-driven evidence-based training (EBT) and assessment proposed by the ICAO, the aims of which are to effectively identify pilot skill deficiencies and reduce the potential risk of unsafe events [
9,
10]. Current pilot competency assessment methods comprise two categories. The first is a comprehensive evaluation method that assesses pilot competencies using an assessment index system, an analytical hierarchy process, and fuzzy comprehensive evaluations [
11,
12,
13]. However, because the determination of the indicator system weights is somewhat subjective, the assessment consistency is limited. The second type is a data-driven assessment method that exploits flight parameter data or pilot behavior data to construct a prediction model [
14,
15] and utilizes machine learning algorithms to predict the pilot’s competency scores [
16]. Although this method can improve the prediction accuracy when the machine learning models are combined, it has poorer interpretability in practical applications and is unable to provide a specific pilot competency evaluation standard. In this study, we focused on the three-dimensional competency assessment (3DCA) criteria in the competency assessment framework model. To assess pilot competency in a wind shear scenario, we propose a method that optimizes the assessment criteria using existing assessment data, which allows the assessment criteria to be automatically generated. Competency assessments based on effectively constructing data-driven quantitative assessment criteria using competency feature representation have rarely been examined.
To assess a pilot’s competency in managing aviation meteorology threats, we propose an adapted competency model that has observable behaviors (OBs) based on wind shear operations and corresponding competency check items. We developed a data-driven pilot competency assessment optimization model based on three-dimensional competency feature modeling and an optimization algorithm to ensure that the model meets the quantitative competency assessment standards. Finally, we conducted experiments using simulated wind shear flight training data to verify the effectiveness of the proposed method.
The remainder of this paper is organized as follows. Section 2 describes the characteristics of different wind shear events and the competency assessment criteria for dealing with these meteorological threats. Section 3 introduces the wind shear operation-based competency model and the competency feature modeling for the competency assessment method. Section 4 details the wind shear simulated flight training evaluation experiments and provides the results of a comprehensive analysis to verify the performance of the proposed data-driven competency assessment method, and Section 5 presents the conclusions.
2. Problem Statements
In actual flight, wind shear is categorized into two types based on the warning method: predictive wind shear (PWS) and reactive wind shear (RWS) [6, 17]. A PWS alert is generated when the airborne meteorological radar transmits coherent pulses into the wind field ahead of the aircraft and receives the echoes to detect the wind vectors of the moist air and their distance; that is, PWS detects an imminent wind shear in the plane’s flight path. RWS refers to a wind shear that exists at the aircraft’s current location, the presence of which is determined by the aircraft’s FAC from the system and environmental signals it collects in flight.
Table 2 shows the 131 wind shear flight alert events that occurred in China from January 2020 to June 2020, the data for which were extracted from a Flight Operational Quality Assurance system [
18]. As can be seen, there are significant distribution differences in the altitude and flight phases for the PWS and RWS alert events.
Of the 131 wind shear events shown in Table 2, 86 were RWS events, accounting for 65.65%, and 45 were PWS events, accounting for 34.35%. As there were no correlations between them, nearly two-thirds of these actual wind shear events could not be detected in advance by the airborne meteorological radars. The flight phase data for the wind shear events show that there were 48 takeoff and climb phase events, most of which were PWS (34 events), and 83 approach and landing phase events, 72 of which were RWS. These differences may stem from the parameters obtained when the wind shear threat factor was being assessed. If the aircraft was already in the wind shear region when the RWS alarm was triggered, the aircraft altitude at that moment was taken as the height of the wind shear event. Table 2 also indicates that 80% of the wind shear events occurred below 200 ft, that is, at a low altitude, where improper handling by the pilots is more likely to lead to a crash.
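The counts quoted from Table 2 can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures stated in the text. The two remaining cells of the phase-by-alert breakdown (RWS events during takeoff and climb, PWS events during approach and landing) are not quoted above and are derived here by subtraction, so they should be read as inferred values.

```python
# Counts quoted in the text for the 131 alert events in Table 2.
TOTAL = 131
RWS_TOTAL, PWS_TOTAL = 86, 45
TAKEOFF_CLIMB, TAKEOFF_CLIMB_PWS = 48, 34
APPROACH_LANDING, APPROACH_LANDING_RWS = 83, 72


def share(count, total=TOTAL):
    """Percentage of all alert events, rounded to two decimals."""
    return round(100 * count / total, 2)


# Cells not quoted in the text, inferred by subtraction (assumption).
takeoff_climb_rws = TAKEOFF_CLIMB - TAKEOFF_CLIMB_PWS
approach_landing_pws = APPROACH_LANDING - APPROACH_LANDING_RWS

print("RWS share (%):", share(RWS_TOTAL))
print("PWS share (%):", share(PWS_TOTAL))
print("Two-thirds of all events:", round(2 * TOTAL / 3, 1))
print("Inferred RWS in takeoff/climb:", takeoff_climb_rws)
print("Inferred PWS in approach/landing:", approach_landing_pws)
```

The inferred cells add up to the quoted RWS and PWS totals, which suggests that the figures in the text are internally consistent.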
However, regardless of the type of wind shear threat or alarm, pilots must initiate timely and effective operations [19]. Pilots can use different methods to identify RWS events, such as predicting them in advance; observing the wind direction, speed, and other characteristics during the phases when the FAC is unable to provide wind shear detection; and paying attention to changes in the avionic system’s flight parameters when the FAC provides wind shear detection information. When an RWS warning is triggered, the flight crew must immediately initiate procedural operations. Airborne meteorological radars can detect PWS events; however, pilots need to acknowledge PWS warnings before takeoff, during takeoff, and when climbing and landing, and they must execute the correct operations to avoid entering the wind shear area. The choice of wind shear procedure indicates a pilot’s competence in managing these types of threats. Effectively assessing these competencies, therefore, requires an analysis of the entire threat and error management process and assessments of the pilot’s competency behavioral indicators for the various wind shear handling sub-tasks.
At present, the IATA has a 3DCA criteria guideline for assessing pilot competency. The pilot competency assessment needs to measure three competency feature dimensions: the number of OBs demonstrated by pilots (how many); the frequency of the demonstrated OBs (how often); and the threat and error management outcomes (TEM outcomes) [20]. The 3DCA criteria thus cover the competency assessment characteristics along three dimensions. However, because the grading standard descriptions are imprecise, examiners vary in their subjective understanding of the assessment guidelines, which can lead to assessment misalignments and inconsistencies. Therefore, a quantitative assessment model that is based on competency behavioral features and built from real flight training data can be used to evaluate pilot competency effectively under adverse weather conditions, such as wind shear.
4. Experiment
To assess the pilots’ wind shear operation competencies, an experiment was conducted in an upset prevention and recovery training course, which is a transitional training course that pilots need to complete before being able to fly an aircraft. The pilots selected for this experiment were licensed and all had 250 h of logged flight time. The selected pilots, all young males, were undergoing the transitional phase course to become co-pilots. The aircraft they were proficient in was the SR-20 or the C172, and the simulator was a Boeing 737NG. Similar competency assessments using simulators are part of the main ICAO- and IATA-recommended EBT retraining programs. The experiment was conducted with the consent of the pilots. Six typical wind shear scenarios or sub-tasks at different flight phases were used to evaluate the pilots’ wind shear procedural operational performance. To verify the performance of the proposed competency assessment model, evaluation data from 200 pilots were obtained from a flight training institution that conducts simulated flight training and assessment of wind shear handling operations.
4.1. Competency Assessment Standard Setting
Unlike traditional methods, in which instructors directly score pilot competencies based on the OB descriptions, the following steps were followed to construct the competency assessment optimization model and determine the quantitative evaluation criteria for the competency assessment.
- (1) Construct observation vectors
First, based on typical wind shear conditions, a flight training evaluation comprising six subjects and twenty-seven check items was developed, which included the TEM checklist data and the ‘how many’ and ‘how often’ information of the 3DCA model. The TEM inspection data corresponded to the ‘TEM outcome’ for the different subjects. The different check items in the inspection worksheet corresponded to the evaluation standards; for the wind shear checklist, there were twenty-seven check items or observations, an example of which is shown in Table 4.
Based on the wind shear operation evaluation checklist data, a vector of scores for the check items is obtained, such as . Similarly, based on the TEM outcomes, a score vector is obtained for the corresponding TEM results.
- (2) Construct the correlation matrix for check items and OBs
We used a Delphi survey to solicit the opinions of flight experts and instructors and construct the correlation matrix between the twenty-seven check item observations, the six TEM outcome observations, and the eleven wind shear operation competency OBs, with an element equal to 1 if there was an influential relationship and 0 if there was not. The resulting observation term–OB correlation matrix constructed from the flight expert information is shown below, where the wind shear operation-based competency OBs range from OB1.1 to OB5.2.
The observation term–OB correlation matrix for the corresponding TEM outcomes was similarly derived.
- (3) Build model of competency assessment feature representation and optimization
The observation vector and the correlation matrix were combined to construct the OB-based competency assessment matrix as follows:
Based on the competency feature representations in the 3DCA criteria-based competency assessment model framework, for each pilot evaluation sample, the how many, how often, and TEM outcome information for the OB performances was determined using Equations (5)–(10). For instance, when
,
, and
, the three-dimensional features after normalization were as follows:
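As a rough illustration of this kind of feature construction, and not a reproduction of Equations (5)–(10), the sketch below spreads check item scores over the OBs they are linked to through a binary correlation matrix and then derives two normalized features. The function names, the 1 to 5 score scale, the `pass_score` cut-off, and the particular definitions of "how many" (share of OBs whose mean linked score reaches the cut-off) and "how often" (mean linked score relative to the maximum) are all assumptions made for the example. A TEM outcome feature could be built in the same way from the TEM score vector and its own correlation matrix.

```python
import numpy as np


def ob_assessment_matrix(scores, correlation):
    """Element (i, j) is the score of check item i if it is linked to OB j, else 0."""
    scores = np.asarray(scores, dtype=float)
    correlation = np.asarray(correlation, dtype=float)
    if correlation.ndim != 2 or correlation.shape[0] != scores.shape[0]:
        raise ValueError("correlation needs one row per check item")
    return scores[:, None] * correlation


def how_many_how_often(scores, correlation, max_score=5.0, pass_score=3.0):
    """Hypothetical normalized 'how many' and 'how often' features in [0, 1]."""
    correlation = np.asarray(correlation, dtype=float)
    matrix = ob_assessment_matrix(scores, correlation)
    links = correlation.sum(axis=0)          # number of check items per OB
    if np.any(links == 0):
        raise ValueError("every OB needs at least one linked check item")
    ob_mean = matrix.sum(axis=0) / links     # mean linked score per OB
    how_many = float(np.mean(ob_mean >= pass_score))
    how_often = float(matrix.sum() / (max_score * links.sum()))
    return how_many, how_often


# Toy example: 4 check items, 3 OBs (the paper uses 27 items and 11 OBs).
toy_scores = [5, 3, 4, 2]
toy_correlation = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]
toy_matrix = ob_assessment_matrix(toy_scores, toy_correlation)
toy_how_many, toy_how_often = how_many_how_often(toy_scores, toy_correlation)
print(toy_matrix)
print(toy_how_many, toy_how_often)
```

In the toy example, two of the three OBs reach the assumed cut-off, and the linked scores sum to 17 out of a possible 25.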
4.2. Competency Assessment Model Solution
The historical evaluation checklist data for 200 pilots were divided into a training set (85%) and a test set (15%), after which the wind shear operation-based competency assessment matrices could be determined for all pilots. Following the assessment method flow, the optimization problem formulated in the previous steps then needed to be solved. Because the training set was small (here, K = N × 85% = 170 samples), the problem was transformed into finding twelve non-negative real numbers, four thresholds for each of the three dimensions, that make the approximate optimization objective as small as possible.
Based on the actual score data in the training set, the score distribution from 1 to 5 for the ‘how often’ characteristic dimension is shown in Figure 2.
In Figure 2, the grades of the 3DCA criteria-based competency behavior indicators for the ‘how often’ characteristic dimension show a single-peak distribution: 1 pilot scored 1 point, 2 scored 2 points, 56 scored 3 points, 103 scored 4 points, and 8 scored 5 points.
In Figure 3, the grades for the ‘how many’ characteristic dimension also show a single-peak distribution: 0 pilots scored 1 point, 1 scored 2 points, 2 scored 3 points, 157 scored 4 points, and 10 scored 5 points.
In Figure 4, the grades for the ‘TEM outcome’ dimension were more concentrated: 0 pilots scored 1 point, 2 scored 2 points, 3 scored 3 points, 156 scored 4 points, and 9 scored 5 points.
Figure 2, Figure 3, and Figure 4 show that the competency feature distributions from the actual flight training sample are unimodal across all three dimensions. The distribution characteristics are representative and indicate that most pilots’ competency is concentrated at a specific level.
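The grade counts reported for the three dimensions can be checked directly. The short sketch below takes the counts as quoted in the text, confirms that each dimension accounts for all 170 training samples, and tests for a single peak; the helper names are illustrative only.

```python
# Grade counts for levels 1 to 5, as quoted in the text for the training set.
GRADE_COUNTS = {
    "how often": [1, 2, 56, 103, 8],
    "how many": [0, 1, 2, 157, 10],
    "TEM outcome": [0, 2, 3, 156, 9],
}


def is_unimodal(counts):
    """True if the counts rise (weakly) to one peak and fall (weakly) after it."""
    peak = counts.index(max(counts))
    rising = all(counts[i] <= counts[i + 1] for i in range(peak))
    falling = all(counts[i] >= counts[i + 1] for i in range(peak, len(counts) - 1))
    return rising and falling


def summarize(counts):
    """Return (total samples, modal level, share of samples at the modal level)."""
    total = sum(counts)
    modal_level = counts.index(max(counts)) + 1
    return total, modal_level, max(counts) / total


for name, counts in GRADE_COUNTS.items():
    total, modal_level, modal_share = summarize(counts)
    print(f"{name}: n={total}, mode=level {modal_level}, "
          f"share at mode={modal_share:.1%}, unimodal={is_unimodal(counts)}")
```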
The grid search method was first used to obtain the initial values of the threshold decision variables for the quantitative assessment criteria, four for each of the three dimensions.
The search was then carried out in steps of h = 0.01 until the approximate optimization objective was minimized, after which the search was stopped. The approximate optimal thresholds obtained in this way define the final competency assessment criteria based on the training sample, which are shown in Table 5.
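The two-stage search described above (a coarse grid for initial values, then refinement in steps of h = 0.01) can be sketched for a single dimension as follows. This is a minimal illustration under stated assumptions rather than the model's actual formulation: the objective used here (the sum of absolute differences between threshold-derived levels and instructor levels), the coordinate-wise refinement, the coarse step of 0.1, and the synthetic data are all hypothetical stand-ins.

```python
import itertools
import numpy as np


def grade(features, thresholds):
    """Map feature values to levels 1-5 using four ascending thresholds."""
    return np.searchsorted(np.asarray(thresholds), features, side="right") + 1


def objective(thresholds, features, levels):
    """Assumed objective: total absolute deviation from the instructor levels."""
    return int(np.abs(grade(features, thresholds) - levels).sum())


def coarse_grid_search(features, levels, step=0.1):
    """Try every ascending 4-tuple on a coarse grid and keep the best one."""
    n = int(round(1 / step))
    grid = np.round(np.arange(1, n) * step, 10)
    best, best_cost = None, None
    for cand in itertools.combinations(grid, 4):  # ascending by construction
        cost = objective(cand, features, levels)
        if best_cost is None or cost < best_cost:
            best, best_cost = np.array(cand), cost
    return best, best_cost


def refine(thresholds, features, levels, h=0.01):
    """Move one threshold at a time by +/- h while the objective keeps falling."""
    t = np.array(thresholds, dtype=float)
    cost = objective(t, features, levels)
    improved = True
    while improved:
        improved = False
        for i in range(len(t)):
            for delta in (-h, h):
                cand = t.copy()
                cand[i] = round(cand[i] + delta, 10)
                if cand[i] < 0 or cand[i] > 1 or np.any(np.diff(cand) <= 0):
                    continue  # keep thresholds in [0, 1] and strictly ascending
                c = objective(cand, features, levels)
                if c < cost:
                    t, cost, improved = cand, c, True
    return t, cost


# Synthetic stand-in data: 170 normalized features and hypothetical thresholds.
rng = np.random.default_rng(0)
features = rng.random(170)
true_thresholds = np.array([0.23, 0.41, 0.62, 0.85])
levels = grade(features, true_thresholds)

coarse, coarse_cost = coarse_grid_search(features, levels, step=0.1)
fine, fine_cost = refine(coarse, features, levels, h=0.01)
print("coarse:", coarse, coarse_cost)
print("refined:", fine, fine_cost)
```

Because the assumed objective is a non-negative integer that must strictly decrease at every accepted move, the refinement loop is guaranteed to stop. A local search of this kind can stall before reaching the global minimum, so the result is only an approximate optimum.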
4.3. Assessment Result Analysis
4.3.1. Consistency Analysis
The wind shear operation-based competency assessment levels for the test sample data were validated against the real levels given by the flight instructors. Using the competency assessment criteria in Table 5, the assessment results are given in Appendix A, Table A1. The lowest of the three dimension levels for each sample is taken as the model assessment level. Consistency analyses often use nonparametric hypothesis testing based on the Spearman rank correlation coefficient [24], which ranges over [−1, +1]. A coefficient less than 0 represents a negative correlation, a coefficient greater than 0 represents a positive correlation, and a coefficient equal to 0 represents no correlation; the closer the coefficient is to 0, the weaker the correlation, and the closer it is to −1 or +1, the stronger the correlation. The correlations between the proposed model-based evaluation levels and the real levels were analyzed (Table 6). Spearman’s correlation coefficient between the levels evaluated by the model and the real levels was r = 0.854, p < 0.01 (**), which indicates that the model levels and the real levels were significantly correlated.
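For readers who want to see how such a coefficient is obtained, the sketch below computes the Spearman rank correlation as the Pearson correlation of average ranks, which handles the many ties that occur with 1 to 5 levels. It uses only the standard library; the function names and the sample levels are made up for illustration and are not the study's data.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Made-up model and instructor levels for eight pilots.
model_levels = [4, 4, 3, 5, 4, 3, 4, 2]
instructor_levels = [4, 4, 3, 5, 3, 3, 4, 2]
print(round(spearman(model_levels, instructor_levels), 3))
```

A significance test (the p-value reported above) would need an additional step, for example a permutation test or a statistics library, which is not shown here.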
4.3.2. Accuracy Analysis
To verify the accuracy of the competency assessment model, the test set evaluation data were used for validation: the three-dimensional competency features were calculated for each pilot, the corresponding level in each dimension was obtained from the competency assessment model criterion thresholds, and the minimum of the three levels was taken as the pilot’s level. Comparisons of the model assessment levels for the samples and their deviations from the actual levels are shown in Figure 5.
The analysis based on the proposed model found that the model levels agreed with the actual levels for 93.33% of the sample, and the remaining 6.67% deviated by no more than one level, which is an acceptable deviation range given the possible subjectivity in the instructors’ original checklist observations. Therefore, the proposed competency assessment optimization model based on historical data provides good results within acceptable rating deviations.
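The minimum-of-three rule and the agreement percentages can be illustrated with a short sketch. The test set size of 30 follows from 15% of 200 pilots; the split into 28 exact matches and 2 one-level deviations is inferred from the reported 93.33% and 6.67%, and the individual levels below are invented purely to reproduce those proportions.

```python
def model_level(how_many, how_often, tem_outcome):
    """Overall level: the lowest of the three dimension levels."""
    return min(how_many, how_often, tem_outcome)


def compare(model_levels, instructor_levels):
    """Agreement statistics between model levels and instructor levels."""
    if len(model_levels) != len(instructor_levels) or not model_levels:
        raise ValueError("need two non-empty sequences of equal length")
    deviations = [m - a for m, a in zip(model_levels, instructor_levels)]
    n = len(deviations)
    return {
        "agreement_pct": round(100 * sum(d == 0 for d in deviations) / n, 2),
        "off_by_one_pct": round(100 * sum(abs(d) == 1 for d in deviations) / n, 2),
        "max_abs_deviation": max(abs(d) for d in deviations),
    }


# Invented example: 30 test pilots, 28 exact matches, 2 one-level deviations.
example_instructor = [4] * 30
example_model = [model_level(4, 4, 5)] * 28 + [model_level(4, 3, 4), model_level(5, 5, 5)]
example_stats = compare(example_model, example_instructor)
print(example_stats)
```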
5. Conclusions
Complex and unexpected aviation meteorological conditions are typical flight safety threats. Therefore, an effective pilot competency evaluation method is needed to enhance pilots’ comprehensive skills in dealing with meteorological aviation threats. The 3DCA criteria are a general criterion framework for assessing the competency of the pilots. Because of the uncertain grading descriptions and subjective evaluator judgment, the consistency of the assessment results is not ensured when using the 3DCA criteria directly. In this study, we developed an evaluation method that automatically generates quantitative assessment criteria based on the existing assessment data. With this focus, we constructed an adapted competency model and observable behavior indicators, conducted competency check item and sub-task decomposition, and then, with simulated flight training evaluation data under wind shear conditions, developed a competency assessment criteria optimization model based on three-dimensional competency feature modeling and an algorithm to determine the competency evaluation criteria solution. The consistency and accuracy of the proposed competency evaluation method were validated on actual wind shear simulation flight training sample data.
While pilots with different preferred aircraft types behave differently under wind shear conditions, they share behavioral performance commonalities. Our approach is based on available evaluation data, and the obtained quantitative competency assessment criteria are not affected by the operational wind shear differences between aircraft types. Meanwhile, because different airlines have their own checklist data for these different training subjects, the quantitative competency assessment criteria need to be generated from appropriate evaluation data. The proposed data-driven competency assessment method can therefore be applied to competency assessment models for other conditions; that is, it can be used for pilot competency assessments at various training stages and to improve pilots’ abilities to deal with sudden aviation threats. Further, because of the large number of complex conditions that must be simulated for competency assessments, training is generally conducted in simulators, except for the skills that need to be verified in real airplanes. Although the conditions in a real aircraft and a simulator are not the same, the observable behavior data from the simulator training scenarios provide good information about whether a pilot has the competencies to deal with the situation in a real aircraft. Further verification of this point remains a direction for future research.