The smart airport concept represents the future of airport operations, and it may dramatically shift the industry towards the adoption of modern technology [1]. As a result of the fourth industrial revolution, the smart airport concept has been evolving all over the world, and it is expected to eliminate the drawbacks of the conventional airport system. According to Bouyakoub et al., Airport 4.0 is a concept that leverages big data and open data to enhance its own innovation [2]. In its 2019 information circular, “Smart Airport Development Research and Practice Report” [3], the Department of Airports of the Civil Aviation Administration of China (CAAC) summarises the future development goals and trends of airports, based on research into and analysis of international smart airport development and practice in a number of countries and regions, including the US, Europe, the EU, Japan, Singapore and Dubai. The trends in airport development identified there state that “Managers will be able to perceive aircraft security warnings in a timely manner; flight delays will be reduced to a minimum; airport resources will be optimally allocated and resource utilisation will be extremely high”. Achieving this requires airports to identify delayed flights accurately and, upon receipt of an alert, to intervene reasonably and effectively in potentially delayed flights. The scientific issues involved include the prediction of delayed flights and the implementation of measures to recover delayed flights.
Most existing papers on flight delay prediction have focused on two aspects: the extraction of influencing factors and the construction of prediction models. Khaksar et al. analysed flight delays in the U.S. airline network using machine learning, and their results showed that visibility, wind and departure time have large impacts on flight delays [4]. Truong et al. used two methods, decision trees and Bayesian inference, to predict the probability of flight delay events and constructed several flight delay prediction models using flight data from different sources; they then described the important airport-related factors and their impacts on flight punctuality [5]. Wu et al. constructed a flight delay prediction model based on a deep SE-DenseNet that fuses flight information, related airport delay information and weather information, and the experimental results showed that prediction accuracy improved by about 1.8% after information fusion compared with considering flight attributes alone [6]. Esmaeilzadeh et al. used a support vector machine to analyse the main factors that cause flight delays, and the analysis showed that pushback delays, taxi-out delays and ground delay programs had the greatest impacts on flight delays [7]. Choi et al. constructed a flight delay prediction model for severe weather conditions based on data mining and supervised machine learning algorithms and compared the prediction results of several algorithms; the results showed that random forest had the highest prediction accuracy [8]. Ye et al. constructed four prediction models based on multiple linear regression, support vector machine, extremely randomised trees and LightGBM, and the results showed that the LightGBM model gave the best predictions [9]. Thiagarajan et al. constructed six departure flight delay prediction models based on machine learning algorithms, and the experimental results showed that the model built on the gradient-boosting algorithm had the highest prediction accuracy [10]. Qu et al. constructed two flight delay prediction models based on deep convolutional neural networks, DCNN and SE-DenseNet, which achieved prediction accuracies of 92.1% and 93.19%, respectively [11]. Yazdi et al. constructed three flight delay prediction models, SDA-LM, SAE-LM and SDA, based on deep learning. Experimental results on balanced and unbalanced datasets showed that the SDA-LM model performed best, with a prediction accuracy of up to 96%, and that performance on balanced datasets was better than on unbalanced datasets [12]. Ding et al. constructed a multiclass prediction model for flight delays based on LightGBM and rebalanced the data using a minority-class oversampling technique combined with Tomek links; their prediction accuracy exceeded 90% [13]. Basturk et al. constructed a flight arrival time prediction model based on random forest and a deep neural network, considering flight, track and weather information, and the results showed that the prediction error of both could be kept within 6 min [14]. Khan et al. proposed a hierarchical integrated machine learning model and used different machine learning algorithms and sampling methods to analyse and validate it, taking Hong Kong International Airport as the research object. The results showed that the model constructed using the SMOTETomek sampling technique and the hyp-free CPCLS machine learning algorithm worked best [15]. Jiang Yu et al. constructed a departure flight delay prediction model based on a spatio-temporal graph convolutional neural network, and the experimental results showed that the model significantly improved the accuracy of flight delay prediction compared with the historical averaging method, a long short-term memory (LSTM) recurrent neural network and a stacked autoencoder [16]. Roger et al. proposed a departure flight delay prediction method based on XGBoost and logistic regression, which focuses on the effect of sparse data on the flight delay prediction model. The experimental results showed that the method significantly improves the model's predictions on sparse datasets [17].
Existing studies have achieved good results in extracting the factors that influence flight delays and in constructing prediction models. However, delay prediction remains largely static and provides limited decision support on how to recover delayed flights once they have been identified. In actual operation, predicting and identifying delayed flights is only a means to an end; the goal is to take effective measures that avoid the delays once they are identified. Compared with originating flights, transit flights involve more ground-support links and require more coordination among airports, airlines and air traffic control (ATC). Therefore, this paper considers the whole transit process and divides it into four stages: approach, taxi-in, turnaround and taxi-out. Delays are predicted and identified in the first three stages in turn: by predicting the time spent in each of these stages, the stage in which a delay is likely to arise is located, and intervention measures appropriate to that stage are taken. The ultimate goal is to identify delayed flights and provide decision support on which delay intervention measures to take, thereby avoiding flight delays.
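The stage-by-stage idea can be illustrated with a minimal sketch. It is not the method developed in this paper: the `StageForecast` type, the `locate_delay` function, the per-stage allowance and all numbers below are hypothetical, chosen only to show how predicted stage times could be checked in operational order to locate where a delay is likely to arise.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class StageForecast:
    """Predicted time for one transit stage (hypothetical structure)."""
    stage: str            # "approach", "taxi-in" or "turnaround"
    predicted_min: float  # predicted time spent in the stage, in minutes
    allowance_min: float  # assumed time allowance for the stage, in minutes


def locate_delay(forecasts: Sequence[StageForecast]) -> Optional[str]:
    """Return the first stage whose predicted time exceeds its allowance.

    Stages are checked in the order given, which is assumed to be the
    operational order. None means no stage is flagged.
    """
    for forecast in forecasts:
        if forecast.predicted_min > forecast.allowance_min:
            return forecast.stage
    return None


# Illustrative values only; they are not taken from any dataset.
forecasts = [
    StageForecast("approach", predicted_min=18.0, allowance_min=20.0),
    StageForecast("taxi-in", predicted_min=9.5, allowance_min=8.0),
    StageForecast("turnaround", predicted_min=55.0, allowance_min=60.0),
]
print(locate_delay(forecasts))
```

In this example the taxi-in stage is flagged, so an intervention would be directed at that stage rather than at the flight as a whole.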
This paper consists of four chapters. Chapter 1 introduces the ground-support process of the transit flight and the key nodes involved, and defines the scope of the delay prediction research in this paper. Chapter 2 introduces the key issues in the transit flight delay prediction method, including the definition of delay thresholds, the complete prediction process and a description of the models involved. Chapter 3 verifies the effectiveness of the proposed delay prediction method for transit flights, using a busy airport in China as the research object. Chapter 4 summarises the paper.