1. Introduction
The grey prediction method is one of the effective methods for analyzing and predicting systems with “small samples, poor information, and uncertainty”. Its biggest advantage is that it requires relatively little information, which makes it suitable when data are difficult to acquire and high prediction accuracy is required. In grey prediction theory, the most basic prediction model is the GM(1,1) model, which is based on a small amount of information and has been widely used in industry, agriculture, medicine, and other fields [1]. However, the GM(1,1) model only analyzes and predicts the change pattern of a single variable, ignoring the influence of other factors on the subject of study. In contrast, the GM(1,N) model is the basic model of multivariate grey system modeling. Compared with the GM(1,1) model, it makes up for the defect of building a time series prediction model directly from historical data alone. It can analyze multiple factors holistically and dynamically and reflect the dynamic relationship between the study variable series and the related factor series.
At present, many scholars have applied multivariate grey prediction models for forecasting in different fields and optimized the GM(1,N) model from different perspectives to improve its prediction accuracy. Jiang S. et al. (2017) combined the grey GM(1,N) model, Markov theory, and the idea of metabolism, and showed empirically that the relative error of the metabolic GM(1,N)–Markov model was 32.73% smaller than that of the GM(1,N) prediction model [2].
Wu and Zhang (2018) [3] proposed a new GMC(1,N) model for information priority accumulation, which adjusted the weight of the data by adding a parameter and ultimately improved the accuracy of the prediction model.
Xiong et al. (2018) [4] proposed a nonlinear multivariate NGM(1,N) model based on kernel function and grey radius, built on the kernel and grey radius sequences of an interval grey number sequence. The kernel and grey radius sequences were simulated and predicted, and the upper and lower bounds of the interval grey number were then derived from the kernel and grey radius formulas. The model was used to forecast the air quality index (AQI) under haze weather, and the experimental results showed that it had high prediction accuracy.
Wu et al. (2019) [5] conducted a systematic study of the GM(α, n) model using grey modeling techniques and the forward difference method. Simulated and predicted values were calculated by transforming fractional order differential equations into fractional order difference equations, and a stochastic testing scheme was proposed to verify the accuracy of the new GM(α, n) model. The results showed that the GM(α, n) model had high potential for forecasting energy consumption in China.
Cheng et al. (2020) [6] derived a modified form of the conventional grey differential equation based on the GM(1,N) whitening equation. Using the grey differential equation of the improved GM(1,N) model to estimate the parameters, they gave parameter estimation methods under three different background values. Compared with the traditional model, the simulation accuracy and prediction accuracy of the improved GM(1,N) model were significantly improved.
According to Zeng et al. (2020) [7], the extreme value of the independent variable was one of the important factors affecting the simulation and prediction results of the GM(1,N) dependent variable. A new multivariate grey prediction model, NMGM(1,N), was therefore constructed on the basis of the smooth generation of variable weight independent variable sequences. The performance of NMGM(1,N) was verified with an example.
Shen et al. (2021) [8] proposed an optimized discrete GMC(1,N) model, called ODGMC(1,N), to further improve the prediction accuracy and stability of GMC(1,N). In particular, a linear correction term was introduced into the new model, and its time response function was derived. The ODGMC(1,N) model not only adjusted the relationship between the dependent and independent variables but also exhibited better stability than GMC(1,N) and its discrete form. The results showed that ODGMC(1,N) had better fitting and prediction accuracy than the traditional GM(1,N) model, the GMC(1,N) model, and its discrete form, regardless of whether the dependent variable series was increasing, decreasing, or fluctuating.
A review of the relevant literature shows that scholars from different countries have improved and optimized the GM(1,N) model in different research fields and obtained better prediction accuracy than the traditional GM(1,N) model. Some studies [5,9] combined the kernel function with GM(1,N) by embedding one model in the other, establishing multivariable grey prediction models that reduced the prediction error to a certain extent; however, the inherent defects of the GM(1,N) model and the optimization of background values were not studied. Other studies [2,3,4,7] improved prediction accuracy by optimizing the background values of GM(1,N), by introducing the idea of metabolism, or by combining the kernel function with GM(1,N). However, these studies did not examine the construction mechanism of the GM(1,N) model in depth and ignored its inherent defects, which limited the prediction accuracy. Addressing the inherent defects of the GM(1,N) model, some studies [7,8] improved it from multiple perspectives and achieved good results. However, these studies did not optimize the accumulation order of GM(1,N) and therefore failed to further improve the prediction accuracy of the multivariate grey model.
This paper provides an in-depth analysis of the GM(1,N) model and finds that its construction process still has inherent flaws. GM(1,N) solves the whitening equation and derives the approximate time response function in a relatively idealized manner, which is its mechanistic flaw [8]. GM(1,N) treats the estimated parameter column as the parameters of its approximate time response equation, which is its parametric defect [10]. GM(1,N) does not mine enough grey action from itself and lacks a study of how the linear relationship of the number of terms affects model performance, and when N = 1, the model cannot achieve structural equivalence with the GM(1,1) model, which indicates its structural defect [8].
In this paper, the multivariable grey model is improved and optimized in multiple stages and from multiple angles, including background value optimization, fractional-order optimization, and linear increment optimization. Specifically, a linear correction term and a grey action quantity are added to the GM(1,N) model to compensate for its inherent defects, giving the OGM(1,N) model. The Particle Swarm Optimization (PSO) algorithm is then used to find the optimal background value, giving the OBGM(1,N) model. The OBGM(1,N) model is further optimized with the PSO algorithm and fractional-order accumulation, giving the FOBGM(1,N) model. Next, the literature in the ScienceDirect database for the last ten years is reviewed, and the factors influencing civil aviation carbon emissions are selected by research frequency statistics and correlation analysis. The civil aviation carbon emission calculation method of the 2006 IPCC Guidelines for National Greenhouse Gas Inventories is introduced, relevant data are collected, and the prediction accuracy of the model before and after optimization is compared to demonstrate the effectiveness of the optimization. The aim is a carbon emission prediction model with high prediction accuracy that civil aviation management can use to evaluate the effectiveness of current emission reduction measures. The GM(1,N) model based on fractional order accumulation and background value optimization developed in this paper can also be applied to carbon emission forecasting for other transportation systems according to practical needs.
2. Basic Theory
2.1. Analysis of the Correlation Characteristics of Traditional Multivariate Grey Prediction Model
In terms of the number of input variables, grey prediction models are divided into univariate and multivariate models. The GM(1,1) model (i.e., a grey prediction model with a first-order equation and only one variable) is the typical univariate grey prediction model and takes a single time series as the modeling object. Based on the system operation law contained in the time series, the grey generation method is used to mine the information in the data and predict the future development of the system. As the prediction model with the widest application range and the most research achievements, the univariate grey prediction model has a simple structure and is easy to understand, because it does not consider the impact of influencing factors on the future development of the system. However, its fitting results often take a saturated S-shaped or exponential form, and because it ignores the influence of external environment changes on the development trend of the system, it cannot predict an “inflection point” in the development of the system. The limitations of this model are therefore relatively large. As the representative multivariate grey prediction model, the GM(1,N) model is a causal prediction method that consists of N − 1 series of related factors and one series of system characteristics, so the influence of external environment changes on the development trend of the system is included in model construction. Multiple regression is similar in form, but it is based on probability and statistics, whereas GM(1,N) is based on grey theory, so the two are essentially different. Compared with univariate grey prediction models, GM(1,N) models are structurally no longer limited to simulating a single variable.
For a long time, the GM(1,N) model has been used more for systematic analysis, and its prediction ability has not been widely utilized. This is because the GM(1,N) model has certain defects in its construction mechanism and structure, so in practice its prediction accuracy is often lower than that of the GM(1,1) model.
2.2. GM(1,N)
2.2.1. Model Definition
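Let $X_1^{(0)} = (x_1^{(0)}(1), x_1^{(0)}(2), \ldots, x_1^{(0)}(m))$ be the sequence of dependent variables (i.e., the sequence of system characteristics). Let $X_i^{(0)} = (x_i^{(0)}(1), x_i^{(0)}(2), \ldots, x_i^{(0)}(m))$, $i = 2, 3, \ldots, N$, be the sequences of independent variables with a high correlation with $X_1^{(0)}$ (i.e., the sequences of explanatory variables). Further, $X_i^{(1)}$ is the 1-AGO sequence of $X_i^{(0)}$, $i = 1, 2, \ldots, N$, with $x_i^{(1)}(k) = \sum_{j=1}^{k} x_i^{(0)}(j)$, and $Z_1^{(1)}$ is the adjacent-mean generated sequence of $X_1^{(1)}$, which can be calculated by

$$z_1^{(1)}(k) = 0.5\left[x_1^{(1)}(k) + x_1^{(1)}(k-1)\right] \qquad (1)$$

In Equation (1), $k = 2, 3, \ldots, m$.
The GM(1,N) model is expressed as

$$x_1^{(0)}(k) + a z_1^{(1)}(k) = \sum_{i=2}^{N} b_i x_i^{(1)}(k) \qquad (2)$$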
2.2.2. Parameter Estimation and Time Response Equation of GM(1,N) Model
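In the GM(1,N) model, $a$ is called the system development coefficient, $b_i x_i^{(1)}(k)$ is called the driver term, $b_i$ is called the driver coefficient, and $\hat{a} = [a, b_2, \ldots, b_N]^T$ is called the parameter column, whose least squares estimate satisfies

$$\hat{a} = (B^T B)^{-1} B^T Y \qquad (3)$$

where $B$ and $Y$ are

$$B = \begin{bmatrix} -z_1^{(1)}(2) & x_2^{(1)}(2) & \cdots & x_N^{(1)}(2) \\ -z_1^{(1)}(3) & x_2^{(1)}(3) & \cdots & x_N^{(1)}(3) \\ \vdots & \vdots & & \vdots \\ -z_1^{(1)}(m) & x_2^{(1)}(m) & \cdots & x_N^{(1)}(m) \end{bmatrix}, \quad Y = \begin{bmatrix} x_1^{(0)}(2) \\ x_1^{(0)}(3) \\ \vdots \\ x_1^{(0)}(m) \end{bmatrix} \qquad (4)$$

$$\frac{dx_1^{(1)}}{dt} + a x_1^{(1)} = \sum_{i=2}^{N} b_i x_i^{(1)} \qquad (5)$$

Equation (5) is the whitening equation of the GM(1,N) model, also called the shadow equation.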
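Let the sequences $X_i^{(0)}$, $X_i^{(1)}$ ($i = 1, 2, \ldots, N$), and $Z_1^{(1)}$ and the matrices $B$ and $Y$ be as described above. Then the solution of the whitening equation $\frac{dx_1^{(1)}}{dt} + a x_1^{(1)} = \sum_{i=2}^{N} b_i x_i^{(1)}$ is

$$x_1^{(1)}(t) = e^{-at}\left[x_1^{(1)}(0) - t\sum_{i=2}^{N} b_i x_i^{(1)}(0) + \sum_{i=2}^{N}\int b_i x_i^{(1)}(t)\, e^{at}\, dt\right] \qquad (6)$$

When the variation in $X_i^{(1)}$ ($i = 1, 2, \ldots, N$) is very small, $\sum_{i=2}^{N} b_i x_i^{(1)}(k)$ can be regarded as a grey constant, and the approximate time response equation of the GM(1,N) model is

$$\hat{x}_1^{(1)}(k+1) = \left[x_1^{(0)}(1) - \frac{1}{a}\sum_{i=2}^{N} b_i x_i^{(1)}(k+1)\right] e^{-ak} + \frac{1}{a}\sum_{i=2}^{N} b_i x_i^{(1)}(k+1) \qquad (7)$$

Then the cumulative reduction equation is

$$\hat{x}_1^{(0)}(k+1) = \hat{x}_1^{(1)}(k+1) - \hat{x}_1^{(1)}(k) \qquad (8)$$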
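The estimation and simulation steps of this subsection can be sketched in a few lines of NumPy. This is a minimal illustration of the textbook GM(1,N) procedure (1-AGO, adjacent-mean sequence, least squares, approximate time response, inverse accumulation), not the authors' implementation. The function names `gm1n_fit` and `gm1n_predict` and the sample data are hypothetical.

```python
import numpy as np

def gm1n_fit(x0):
    """Least squares estimate of the parameter column [a, b_2, ..., b_N].

    x0 has shape (N, m): row 0 is the system characteristic sequence,
    rows 1..N-1 are the related factor sequences (assumed layout).
    """
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0, axis=1)                  # 1-AGO of every sequence
    z1 = 0.5 * (x1[0, 1:] + x1[0, :-1])         # adjacent-mean sequence, k = 2..m
    B = np.column_stack([-z1, x1[1:, 1:].T])    # rows: [-z1(k), x2(k), ..., xN(k)]
    Y = x0[0, 1:]
    params, *_ = np.linalg.lstsq(B, Y, rcond=None)
    return params

def gm1n_predict(x0, params):
    """In-sample simulation: approximate time response, then inverse accumulation."""
    x0 = np.asarray(x0, dtype=float)
    x1 = np.cumsum(x0, axis=1)
    a, b = params[0], params[1:]
    driver = (b @ x1[1:, :]) / a                # (1/a) * sum_i b_i * x_i^(1)(k)
    k = np.arange(x0.shape[1])                  # exponent index, 0 at the first point
    x1_hat = (x0[0, 0] - driver) * np.exp(-a * k) + driver
    # Inverse accumulation: first value is kept, the rest are first differences.
    return np.concatenate([[x1_hat[0]], np.diff(x1_hat)])

# Made-up illustrative data: row 0 = characteristic sequence, row 1 = one related factor.
data = np.array([[2.0, 2.4, 2.9, 3.1, 3.8, 4.2],
                 [1.0, 2.0, 1.5, 3.0, 2.5, 4.0]])
params = gm1n_fit(data)
fitted = gm1n_predict(data, params)
```

Forecasting beyond the sample would additionally require values of the related factor sequences at the future time points, which this sketch does not address.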
2.2.3. Defects of the GM(1,N) Model
In the construction of the GM(1,N) model, there are still some shortcomings that affect the prediction accuracy of the GM(1,N) model.
(1) The GM(1,N) model solves the whitening equation and derives the approximate time response function in a relatively idealized way. In reality, the “small magnitude of variation” of $X_i^{(1)}$ is difficult to satisfy. Because the variables represented by $X_i^{(1)}$ are different, their development trends, dynamic rules, and change characteristics generally differ, and a small amplitude of change is difficult to guarantee. This makes the prediction performance of the GM(1,N) model unstable and constitutes its mechanism defect [10].
(2) The time response equation of GM(1,N) is not derived from the grey differential equation $x_1^{(0)}(k) + a z_1^{(1)}(k) = \sum_{i=2}^{N} b_i x_i^{(1)}(k)$ but is computed from its shadow equation, whereas the parameter column $\hat{a}$ is estimated from the grey differential equation. The GM(1,N) model nevertheless regards the estimated parameter column as the parameters of the approximate time response equation. This “dislocation” between the equation used for parameter estimation and the equation to which the parameters are applied causes unstable prediction performance, which is its parameter defect [4].
(3) The GM(1,N) model has a simple structure and is a state model and factor model. It does not mine a sufficient grey action quantity from itself, and the effect on model performance of the linear relationship between the number of terms has not been studied. In addition, the GM(1,N) model serves as a first-order grey prediction model for $N$ variables, but when N = 1, the model cannot achieve structural equivalence with the GM(1,1) model. This indicates that the GM(1,N) model also has structural flaws, resulting in relatively low prediction accuracy [10].
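The structural defect in item (3) can be made concrete with a short worked comparison. The following sketch uses the standard textbook forms of the two models. The symbols ($a$, $b$, $b_i$, $z_1^{(1)}(k)$) follow common grey-model notation and are assumed here rather than taken from the paper's own equations.

```latex
% GM(1,N) grey differential equation
x_1^{(0)}(k) + a\,z_1^{(1)}(k) = \sum_{i=2}^{N} b_i\,x_i^{(1)}(k)

% Setting N = 1 leaves an empty sum on the right-hand side
x_1^{(0)}(k) + a\,z_1^{(1)}(k) = 0

% GM(1,1) grey differential equation, with grey action quantity b
x^{(0)}(k) + a\,z^{(1)}(k) = b
```

With N = 1 the driver sum vanishes, so the constant grey action quantity $b$ of GM(1,1) has no counterpart, and the two models are not structurally equivalent.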
2.3. Prediction Accuracy Evaluation System
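Mean absolute percentage error (MAPE) can be used to compare the prediction accuracy of different models [11], which can be obtained with

$$\mathrm{MAPE} = \frac{1}{n}\sum_{k=1}^{n} \Delta(k) \times 100\%$$

where absolute error $\varepsilon(k)$ and relative error $\Delta(k)$ are calculated as

$$\varepsilon(k) = x^{(0)}(k) - \hat{x}^{(0)}(k), \qquad \Delta(k) = \frac{|\varepsilon(k)|}{x^{(0)}(k)}$$

Here $x^{(0)}(k)$ is the actual value, $\hat{x}^{(0)}(k)$ is the predicted value, and $n$ is the number of samples. Further, the mean square error (MSE), root mean square error (RMSE), and mean absolute error (MAE) are calculated as

$$\mathrm{MSE} = \frac{1}{n}\sum_{k=1}^{n} \varepsilon(k)^2, \qquad \mathrm{RMSE} = \sqrt{\mathrm{MSE}}, \qquad \mathrm{MAE} = \frac{1}{n}\sum_{k=1}^{n} |\varepsilon(k)|$$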
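The four accuracy measures can be computed together from the actual and predicted sequences. The sketch below uses the usual textbook definitions. The function name `accuracy_metrics` is hypothetical, and the sample numbers are made up for illustration.

```python
import numpy as np

def accuracy_metrics(actual, predicted):
    """Return MAPE (in percent), MSE, RMSE and MAE for two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    err = actual - predicted                 # absolute error per point (signed)
    rel = np.abs(err) / np.abs(actual)       # relative error per point
    mse = np.mean(err ** 2)
    return {
        "MAPE": 100.0 * rel.mean(),
        "MSE": mse,
        "RMSE": np.sqrt(mse),
        "MAE": np.mean(np.abs(err)),
    }

# Made-up example values, not data from the paper.
metrics = accuracy_metrics([100.0, 200.0], [110.0, 180.0])
```

MAPE is undefined when an actual value is zero, so this sketch assumes strictly nonzero observations, which holds for emission totals.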
2.4. Carbon Emission Measurement of Civil Aviation
To verify the prediction performance of the model, data on China's civil aviation carbon emissions were used to compare the model before and after improvement. The Intergovernmental Panel on Climate Change (IPCC) was jointly established by the World Meteorological Organization (WMO) and the United Nations Environment Programme (UNEP). This paper calculates carbon emissions based on Method 2 for civil aviation in Volume 2 of the 2006 IPCC Guidelines for National Greenhouse Gas Inventories (hereinafter referred to as the Guidelines). A flight has two stages, the landing and take-off (LTO) stage and the cruise stage, and the method accounts separately for aviation carbon emissions below and above 914 m (3000 feet), that is, for the LTO phase and the cruise phase. The specific calculation process is as follows:

$$E = E_{LTO} + E_{cruise}$$

where $E$ is the carbon emissions from air transport, $E_{LTO}$ indicates the carbon emissions from air transport in the LTO phase, and $E_{cruise}$ indicates the carbon emissions from air transport in the cruise phase. $E_{LTO}$ and $E_{cruise}$ are calculated as

$$E_{LTO} = N_{LTO} \times EF_{LTO}$$

$$E_{cruise} = (F - F_{LTO}) \times EF_{cruise}, \qquad F_{LTO} = N_{LTO} \times f_{LTO}, \qquad EF_{cruise} = H \times L$$

$N_{LTO}$ indicates the number of aircraft landings and takeoffs in the national aviation industry; $EF_{LTO}$ indicates the CO2 emission factor in the LTO phase, using the average value of each aircraft type in the Guidelines, i.e., 4341 kg/LTO; $F_{LTO}$ indicates the fuel consumption in the LTO phase; $f_{LTO}$ indicates the fuel consumption per LTO, using the average value of each aircraft type in the Guidelines, i.e., 1374 kg/LTO; $F$ indicates the total fuel consumption; $EF_{cruise}$ indicates the CO2 emission factor in the cruise phase; $H$ indicates the amount of CO2 per unit calorific value, using the recommended value of 71,500 kg/TJ in the Guidelines; and $L$ indicates the lower heating value of aviation fuel, using the recommended value of 44,100 kJ/kg in the Guide for Accounting Methods and Reporting of Greenhouse Gas Emissions of Chinese Civil Aviation Enterprises prepared by the National Development and Reform Commission.
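The two-phase calculation can be sketched as follows, using the constants quoted above. The sketch assumes that cruise fuel is total fuel minus LTO fuel and that the cruise emission factor is the product of the CO2 content per unit calorific value (H) and the lower heating value (L), converted to consistent units. The function name `aviation_co2_kg` and the input figures are hypothetical.

```python
EF_LTO = 4341.0          # kg CO2 per LTO cycle (value quoted in the text)
FUEL_PER_LTO = 1374.0    # kg fuel per LTO cycle (value quoted in the text)
H = 71500.0              # kg CO2 per TJ (value quoted in the text)
L = 44100.0              # kJ per kg of aviation fuel (value quoted in the text)
KJ_PER_TJ = 1e9          # 1 TJ = 1e9 kJ

# Cruise emission factor in kg CO2 per kg fuel (assumed to be H * L after unit conversion).
EF_CRUISE = H * L / KJ_PER_TJ

def aviation_co2_kg(n_lto, total_fuel_kg):
    """Total CO2 (kg) as LTO-phase emissions plus cruise-phase emissions."""
    e_lto = n_lto * EF_LTO                        # LTO-phase emissions
    fuel_lto = n_lto * FUEL_PER_LTO               # fuel burned in LTO cycles
    e_cruise = (total_fuel_kg - fuel_lto) * EF_CRUISE
    return e_lto + e_cruise

# Made-up inputs: 1000 LTO cycles and 2,000,000 kg of fuel in total.
total = aviation_co2_kg(1000, 2_000_000.0)
```

With these constants the cruise factor works out to roughly 3.15 kg of CO2 per kg of fuel.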
5. Conclusions
In this paper, based on the GM(1,N) model, the mechanism defects, parameter defects, and structural defects of the GM(1,N) model were first compensated by adding a linear correction term and a grey action quantity, and the OGM(1,N) model was established. Then, the background value coefficients of the OGM(1,N) model were optimized using the PSO algorithm, and the OBGM(1,N) model was established. Next, by introducing the fractional order idea, the PSO algorithm was used to optimize the cumulative order of the OBGM(1,N) model, extending the order from the integer field to the real number field and establishing the FOBGM(1,N) model. Five influencing factors were determined to predict China’s civil aviation carbon emissions using civil aviation passenger traffic, civil aviation cargo volume, total civil aviation turnover, civil aviation fuel consumption, civil aviation industry-wide operating income, and civil aviation transportation intensity. Based on the carbon emission prediction data of civil aviation transportation in China, the improvement effect of each model was empirically studied. The prediction results show that the prediction error decreases step by step and the prediction accuracy increases after multi-level and multi-angle improvement and optimization. Compared with the GM(1,5) model, the MAPE of the OGM(1,5) model decreased by 24.40%, the MAPE of the OBGM(1,5) model decreased by 24.72%, and the MAPE of the FOBGM(1,5) model decreased by 31.86%. These results reflect the effectiveness of the model improvement and demonstrate the practicability of the FOBGM(1,5) model.
From the perspective of algorithm improvement, the idea of improving the GM(1,N) model based on fractional-order accumulation and background value optimization has practical significance and application prospects and can be applied to the optimization of other prediction models in grey system theory. Although the paper has optimized the model from multiple angles and at multiple levels, including background value, fractional order, and linear increment, the model still has room for improvement, for example through initial value improvement, residual correction, and metabolism, which are subjects for further research.