4.2. The Performance Assessment of Volatility Forecasts
According to the empirical results in
Table 2, the selected empirical model in this study captures the common features of financial assets well. I then produced the in-sample and out-of-sample volatility forecasts of the 16 models for the 14 stock indices in the G7 and E7.
Table 3 and
Table 4 report the in-sample volatility forecasts based on the MAE and RMSE loss functions, respectively, for the overall period. In turn,
Table 5 and
Table 6 list the out-of-sample volatility forecasts based on the MAE and RMSE loss functions, respectively, for the forecasting period, obtained with a rolling window approach. For each data series, the sixteen models (eight parametric and eight composed volatility forecasting models) were first estimated using a sample of 3250 daily returns, and a volatility forecast for the next period was obtained. The estimation period was then rolled forward by adding one new day and omitting the most distant day. Repeating this procedure yielded the out-of-sample volatility forecasts for the next 250 days.
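The rolling window procedure and the two loss functions can be sketched as follows. This is a minimal illustration rather than the estimation code used in this study: the function names are hypothetical, the default forecaster is a simple sample-variance placeholder standing in for any of the sixteen fitted models, the squared return is assumed as the proxy for actual variance, and MAE and RMSE are assumed to take their standard forms.

```python
import numpy as np

def sample_variance(sample):
    # Placeholder forecaster (assumption): NOT one of the sixteen models.
    return float(np.var(sample))

def rolling_forecasts(returns, window=3250, horizon=250, forecaster=sample_variance):
    """One-step-ahead variance forecasts from a fixed-length rolling window."""
    returns = np.asarray(returns, dtype=float)
    if len(returns) < window + horizon:
        raise ValueError("need at least window + horizon observations")
    forecasts = np.empty(horizon)
    for k in range(horizon):
        # Window k drops the most distant day and adds one new day.
        sample = returns[k:k + window]
        forecasts[k] = forecaster(sample)  # forecast for day window + k
    return forecasts

def mae(actual, forecast):
    """Mean absolute error (standard definition assumed)."""
    diff = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(diff)))

def rmse(actual, forecast):
    """Root mean squared error (standard definition assumed)."""
    diff = np.asarray(actual, dtype=float) - np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic returns for illustration only; these are not the data of this study.
rng = np.random.default_rng(0)
returns = rng.standard_normal(3500)
forecasts = rolling_forecasts(returns)
actual = returns[3250:3500] ** 2  # squared returns as the variance proxy (assumption)
print(mae(actual, forecasts), rmse(actual, forecasts))
```

Passing a different `forecaster` callable, for example one that fits a GARCH-type model to the window and returns its one-step-ahead variance, would reuse the same loop unchanged.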
Figure 4 shows the trend of actual variance and its two out-of-sample variance forecasts obtained by the GJR-GARCH-SGT and GJR-GARCH-SGT-NN models. From
Figure 4, I observed a sharp spike in variance in March of last year, owing to the worldwide spread of COVID-19.
Subsequently, using
Table 3,
Table 4,
Table 5 and
Table 6, I compared the volatility forecasting performance for four categories of model to explore (i) whether the neural network approach improves the performance of volatility forecasting, (ii) whether the leverage effect setting improves it, (iii) whether the non-normal return distribution setting improves it and (iv) which of the 16 models has the best volatility forecasting performance. The results of the comparison are recorded in columns S1, S2, S3 and S4 of each table (regarding the four categories of model, please see
Section 2.2 for more details). The results in columns S1, S2, S3 and S4 of
Table 3,
Table 4,
Table 5 and
Table 6 are also summarized in
Table 7 in order to easily explore the four main issues of this study. I used the data in
Table 3 to illustrate the performance comparison of volatility forecasting for the four categories of the model. Regarding the performance comparison of volatility forecasting for the first category of model, I took the following two examples in
Table 3 to illustrate it. First, regarding the paired models, ‘the GARCH-N and GARCH-N-NN’, the GARCH-N model has the lower value of MAE for the cases of DJ, TSX, FTSE and BVSP whereas the GARCH-N-NN model possesses the lower value of MAE for the cases of the other ten stock indices. For example, regarding the DJ, the value of MAE for the GARCH-N model (1.94301) is lower than that for the GARCH-N-NN model (1.94312). Furthermore, in
Table 3 the results ‘4’ for the GARCH-N model and ‘10’ for the GARCH-N-NN model are recorded in column ‘S1’, respectively corresponding to the rows ‘GARCH-N’ and ‘GARCH-N-NN’. In
Table 7, the above results, ‘4’ and ‘10’, are also recorded in column ‘MAE’ below ‘In-sample’ of S1, respectively corresponding to the rows ‘GARCH-N’ and ‘GARCH-N-NN’. Second, regarding the paired models, ‘the GJR-GARCH-SGT and GJR-GARCH-SGT-NN’, the GJR-GARCH-SGT-NN model possesses the lower value of MAE for all 14 stock indices, so the GJR-GARCH-SGT model never obtains the lower value. In
Table 3 the results ‘0’ and ‘14’ are then recorded in column ‘S1’, respectively corresponding to the rows ‘GJR-GARCH-SGT’ and ‘GJR-GARCH-SGT-NN’. In
Table 7, the above results, ‘0’ and ‘14’, are also recorded in column ‘MAE’ below ‘In-sample’ of S1, respectively corresponding to the rows ‘GJR-GARCH-SGT’ and ‘GJR-GARCH-SGT-NN’. Regarding the performance comparison of volatility forecasting for the second category of model, I took the following example in
Table 3 to explain it. Regarding the paired models, ‘the GARCH-N and GJR-GARCH-N models’, the GARCH-N model has the lower value of MAE for the cases of MIB, RTSI, BSE and XU100 whereas the GJR-GARCH-N model possesses the lower value of MAE for the other ten stock indices. For example, regarding the MIB, the value of MAE for the GARCH-N model (3.79126) is lower than that for the GJR-GARCH-N model (3.79163). In
Table 3 the results ‘4’ and ‘10’ are then recorded in column ‘S2’, respectively corresponding to the rows ‘GARCH-N’ and ‘GJR-GARCH-N’. In
Table 7, the above results, ‘4’ and ‘10’, are also recorded in column ‘MAE’ below ‘In-sample’ of S2, respectively corresponding to the rows ‘GARCH-N’ and ‘GJR-GARCH-N’. Regarding the performance comparison of volatility forecasting for the third category of model, I took the following example in
Table 3 to explain it. Among a group of models, ‘the GARCH-N, GARCH-T, GARCH-ST and GARCH-SGT’, the GARCH-N model possesses the lowest value of MAE for the cases of DJ, TSX, FTSE, CAC40, DAX, BVSP and RTSI but the GARCH-T model never obtains the lowest value of MAE. For instance, regarding the DJ, the value of MAE for the GARCH-N model (1.94301) is lower than that for the other three GARCH-based models, namely the GARCH-T (1.99808), GARCH-ST (1.98032) and GARCH-SGT (1.95943). In other words, regarding the DJ, the GARCH-N model possesses the lowest value of MAE among the four GARCH-based models. In addition, the GARCH-ST model has the lowest value of MAE for the cases of BSE, MXX and XU100 whereas the GARCH-SGT model obtains the lowest value of MAE for the cases of MIB, N225, SSE and JKSE. In
Table 3 the results ‘7’, ‘0’, ‘3’, and ‘4’ are then recorded in column ‘S3’, respectively corresponding to the rows ‘GARCH-N’, ‘GARCH-T’, ‘GARCH-ST’ and ‘GARCH-SGT’. In
Table 7, the above results ‘7’, ‘0’, ‘3’, and ‘4’ are also recorded in column ‘MAE’ below ‘In-sample’ of S3, respectively corresponding to the rows ‘GARCH-N’, ‘GARCH-T’, ‘GARCH-ST’ and ‘GARCH-SGT’. Regarding the performance comparison of volatility forecasting for the fourth category of model, I took the following example in
Table 3 to explain it. Among all 16 models, the GJR-GARCH-N model has the lowest value of MAE for the cases of TSX and BVSP. For example, regarding the TSX, the value of MAE for the GJR-GARCH-N model (1.64781) is lower than that for the other fifteen models, namely the GARCH-N (1.67592), GARCH-T (1.72006), GARCH-ST (1.70677), GARCH-SGT (1.70920), GJR-GARCH-T (1.69383), GJR-GARCH-ST (1.68906), GJR-GARCH-SGT (1.69515), GARCH-N-NN (1.71390), GARCH-T-NN (1.71509), GARCH-ST-NN (1.71786), GARCH-SGT-NN (1.71812), GJR-GARCH-N-NN (1.68169), GJR-GARCH-T-NN (1.69363), GJR-GARCH-ST-NN (1.69340) and GJR-GARCH-SGT-NN (1.69350). In other words, regarding the TSX, the GJR-GARCH-N model has the lowest value of MAE among all sixteen models. In
Table 3 the result ‘2’ is then recorded in column ‘S4’, corresponding to the row ‘GJR-GARCH-N’. In
Table 7, the above result ‘2’ is also recorded in column ‘MAE’ below ‘In-sample’ of S4, corresponding to the row ‘GJR-GARCH-N’.
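The counting behind columns S1 to S4 can be sketched as follows. This is an illustrative reading of the procedure, not the code used in this study: the helper name `count_wins` is hypothetical, only the DJ and TSX values of MAE quoted above are used (the full tables cover 14 indices), and ties, if any, are assumed to go to the first model listed.

```python
import numpy as np

def count_wins(losses, group):
    """For each model in `group`, count the indices on which it has the lowest loss."""
    table = np.array([losses[m] for m in group])  # rows: models, columns: indices
    winners = table.argmin(axis=0)                # winning row per index (first on ties)
    return {m: int(np.sum(winners == i)) for i, m in enumerate(group)}

# In-sample MAE values quoted in the text, in the order [DJ, TSX].
mae_table3 = {
    "GARCH-N":    [1.94301, 1.67592],
    "GARCH-T":    [1.99808, 1.72006],
    "GARCH-ST":   [1.98032, 1.70677],
    "GARCH-SGT":  [1.95943, 1.70920],
    "GARCH-N-NN": [1.94312, 1.71390],
}

# First category (S1): a parametric model against its composed counterpart.
s1 = count_wins(mae_table3, ["GARCH-N", "GARCH-N-NN"])
# Third category (S3): the four distributions under the same specification.
s3 = count_wins(mae_table3, ["GARCH-N", "GARCH-T", "GARCH-ST", "GARCH-SGT"])
print(s1)  # {'GARCH-N': 2, 'GARCH-N-NN': 0}
print(s3)  # {'GARCH-N': 2, 'GARCH-T': 0, 'GARCH-ST': 0, 'GARCH-SGT': 0}
```

On these two indices the GARCH-N model wins both comparisons, in line with DJ and TSX being among the cases listed for it above. The second and fourth categories use the same function with a different `group`.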
Table 7 lists the summary results of performance comparison for the in-sample and out-of-sample volatility forecasts based on the MAE and RMSE loss functions. In other words, the numbers in column ‘MAE’ below ‘In-sample’ of S1, S2, S3 and S4 are respectively summarized from those in columns ‘S1’, ‘S2’, ‘S3’ and ‘S4’ of
Table 3. Likewise, the numbers in column ‘RMSE’ below ‘In-sample’ of S1, S2, S3 and S4 are respectively summarized from those in columns ‘S1’, ‘S2’, ‘S3’ and ‘S4’ of
Table 4. The numbers in column ‘MAE’ below ‘Out-of-sample’ of S1, S2, S3 and S4 are respectively summarized from those in columns ‘S1’, ‘S2’, ‘S3’ and ‘S4’ of
Table 5. Likewise, the numbers in column ‘RMSE’ below ‘Out-of-sample’ of S1, S2, S3 and S4 are respectively summarized from those in columns ‘S1’, ‘S2’, ‘S3’ and ‘S4’ of
Table 6. To make the four main issues of this study easier to explore, I summed the four numbers in column ‘S1’ for each model, and did the same for columns ‘S2’, ‘S3’ and ‘S4’. For example, regarding the GARCH-N model, the numbers ‘4’ and ‘0’ are respectively in columns ‘MAE’ and ‘RMSE’ below ‘In-sample’ of S1. Moreover, the numbers ‘3’ and ‘5’ are respectively in columns ‘MAE’ and ‘RMSE’ below ‘Out-of-sample’ of S1. Hence, in
Table 7, the sum of the four numbers in column ‘S1’ equals 12 (4 + 0 + 3 + 5) and is recorded in the column ‘Sum’ below ‘S1’ and the row ‘GARCH-N’. For the other 15 models, the four numbers in column ‘S1’ are summed in the same way. For all 16 models, the four numbers in columns ‘S2’, ‘S3’ and ‘S4’ are also summed in the same way, and the results are recorded in the column ‘Sum’ below ‘S2’, ‘S3’ and ‘S4’, respectively. Subsequently, I used all 16 numbers in column ‘Sum’ below ‘S1’, ‘S2’, ‘S3’, and ‘S4’ of
Table 7 to compare the volatility forecasting performance for the four categories of model. As shown by the numbers in column ‘Sum’ below S1 of
Table 7, I found that the numbers for all eight composed volatility forecasting models are far greater than those for all eight corresponding parametric volatility forecasting models. For example, the number for the GARCH-N-NN model (44) is far greater than that for the GARCH-N model (12). These results indicate that the performance for the composed volatility forecasting models is significantly superior to that for the parametric volatility forecasting models. In other words, the neural network approach can significantly improve the performance of volatility forecasting. As reported by the numbers in column ‘Sum’ below S2 of
Table 7, I found that, regarding the parametric volatility forecasting approach, the numbers for all four GJR-based models are far greater than those for all four corresponding GARCH-based models, as shown in panel A of this table. For instance, the number for the GJR-GARCH-N model (38) is far greater than that for the GARCH-N model (18). I also found that, regarding the composed volatility forecasting approach, the numbers for all four GJR-based models are far greater than those for all four corresponding GARCH-based models, as shown in panel B of this table. These results imply that, irrespective of the parametric or composed forecasting approach, the performance of the GJR-based models is significantly superior to that of the GARCH-based models. That is to say, the leverage effect setting can significantly improve the performance of volatility forecasting (as shown in
Section 2, the GJR-based model can capture the leverage effect present in financial assets whereas the GARCH-based model cannot). As illustrated by the numbers in column ‘Sum’ below S3 of
Table 7, I found that the numbers for the models with a non-normal distribution are not greater than those for the models with the normal distribution under the same volatility forecasting approach and volatility specification. For example, the number for the GARCH-N model (35) is far greater than those for the GARCH-T (1), GARCH-ST (5) and GARCH-SGT (15) models. Moreover, the number for the GJR-GARCH-N model (45) is far greater than those for the GJR-GARCH-T (2), GJR-GARCH-ST (3) and GJR-GARCH-SGT (6) models. Furthermore, the number for the GARCH-N-NN model (37) is far greater than those for the GARCH-T-NN (13), GARCH-ST-NN (4) and GARCH-SGT-NN (2) models. In addition, the number for the GJR-GARCH-N-NN model (32) is far greater than those for the GJR-GARCH-T-NN (12), GJR-GARCH-ST-NN (3) and GJR-GARCH-SGT-NN (10) models. The above results indicate that, irrespective of volatility forecasting approach or volatility specification, the performance of the models with a non-normal distribution is not superior to that of the models with the normal distribution. In other words, the non-normal return distribution setting does not improve the performance of volatility forecasting. As listed by the 16 numbers in column ‘Sum’ below S4 of
Table 7, I found that the number for the GJR-GARCH-N-NN model (15) is the greatest. The numbers for the GJR-GARCH-N, GJR-GARCH-T-NN and GJR-GARCH-SGT-NN models are all equal to 9, the second greatest among the 16 numbers. These results indicate that, among the 16 models in this study, the GJR-GARCH-N-NN model performs best, followed by the GJR-GARCH-N, GJR-GARCH-T-NN and GJR-GARCH-SGT-NN models. In other words, the GJR-GARCH model combined with both the normal distribution and a neural network approach has the best volatility forecasting performance among the sixteen models in this study.
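The ‘Sum’ column described above can be checked with a few lines of arithmetic. The sketch below uses the four S1 counts quoted for the GARCH-N model. The counts for its partner, the GARCH-N-NN model, are derived under the assumption that each of the 14 indices is counted for exactly one model of a pair (no ties), and the variable names are hypothetical.

```python
N_INDICES = 14

# S1 counts for GARCH-N quoted in the text: (forecast type, loss function) -> wins.
garch_n_s1 = {
    ("in-sample", "MAE"): 4,
    ("in-sample", "RMSE"): 0,
    ("out-of-sample", "MAE"): 3,
    ("out-of-sample", "RMSE"): 5,
}
garch_n_sum = sum(garch_n_s1.values())

# Assumption: within a pair, every index is won by exactly one of the two models.
garch_n_nn_s1 = {case: N_INDICES - wins for case, wins in garch_n_s1.items()}
garch_n_nn_sum = sum(garch_n_nn_s1.values())

print(garch_n_sum, garch_n_nn_sum)  # 12 44
```

Under that assumption the derived total of 44 agrees with the value reported for the GARCH-N-NN model, and the two sums add up to 56, that is, 14 indices times 4 cases.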
In addition, this study investigates which model is the most suitable for each of the fourteen stock indices, that is, which model has the best volatility forecasting performance for each stock index. To explore this issue, I summarized the best-performing model for each stock index based on two types of volatility forecasts (in-sample and out-of-sample) and two types of loss function (MAE and RMSE). Taking the ‘DJ’ stock index as an example, among the 16 models, the GJR-GARCH-SGT-NN model has the best performance for the in-sample volatility forecast based on MAE (respectively, RMSE) as shown in the column ‘DJ’ of
Table 3 (respectively,
Table 4). These results are recorded in column ‘DJ’ and rows ‘MAE’ and ‘RMSE’ of ‘In-sample’ in
Table 8. Conversely, among the 16 models, the GJR-GARCH-N-NN model has the best performance for out-of-sample volatility forecast based on MAE (respectively, RMSE) as shown in the column ‘DJ’ of
Table 5 (respectively,
Table 6). These results are recorded in column ‘DJ’ and rows ‘MAE’ and ‘RMSE’ of ‘Out-of-sample’ in
Table 8. Hence,
Table 8 summarizes the most suitable model for each stock index. In other words, the results listed in row ‘MAE’ (respectively, ‘RMSE’) of ‘In-Sample’ in
Table 8 are summarized from the results of the performance comparison for the fourth category of model in
Table 3 (respectively,
Table 4). On the other hand, the results listed in row ‘MAE’ (respectively, ‘RMSE’) of ‘Out-of-Sample’ in
Table 8 are summarized from the results of the performance comparison for the fourth category of model in
Table 5 (respectively,
Table 6). From
Table 8, I found that both GJR-GARCH-N-NN and GJR-GARCH-SGT-NN are the most suitable models for the DJ stock index because both the GJR-GARCH-N-NN and GJR-GARCH-SGT-NN appear twice among four cases in column ‘DJ’ in
Table 8 (the four cases in
Table 8 are composed of two types of volatility forecasts (in-sample and out-of-sample) and two types of loss function (MAE and RMSE) when the volatility forecasting of a specific stock index is executed). These results are recorded in row ‘Best model’, corresponding to column ‘DJ’ in
Table 8. Similarly, GJR-GARCH-N is the most suitable model for the TSX stock index because it appears most often among the four cases. This result is recorded in row ‘Best model’, corresponding to column ‘TSX’ in
Table 8. Following the same procedure, I found the most suitable models for the other stock indices and recorded them in row ‘Best model’ and the column corresponding to the specific stock index in
Table 8. From the results listed in the row ‘Best model’ of
Table 8, I drew the following conclusions. First, GJR-GARCH-N is the most suitable model for the TSX, BVSP and MXX. Second, GARCH-N-NN is the most suitable model only for the RTSI. Third, GJR-GARCH-N-NN is the most suitable model for the DJ, FTSE, MIB and SSE. Fourth, GJR-GARCH-T-NN is the most suitable model for the DAX, JKSE and XU100. Fifth, GJR-GARCH-SGT-NN is the most suitable model for the DJ, N225 and BSE. To sum up, the most suitable models for the 14 stock indices are spread across the GJR-GARCH-N, GARCH-N-NN, GJR-GARCH-N-NN, GJR-GARCH-T-NN and GJR-GARCH-SGT-NN models. These results indicate that the most suitable model is not necessarily the same for each of the 14 stock indices. Most of these models include the leverage effect setting and are combined with a neural network approach. As to the distribution setting, they are spread across the normal, Student’s t and SGT distributions with no clear pattern. Hence, these conclusions agree with those obtained from the analysis of the previous issues. That is, a neural network approach and the leverage effect setting can significantly improve the performance of volatility forecasting, but the non-normal distribution setting cannot.
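The ‘Best model’ row can be read as a most-frequent-winner rule over the four cases of each stock index. The sketch below illustrates that reading with the DJ results described above. The function name `best_models` is hypothetical, and keeping every tied model is an assumption consistent with the two models reported for the DJ.

```python
from collections import Counter

def best_models(case_winners):
    """Return the model(s) that win most often among the given cases, keeping ties."""
    counts = Counter(case_winners)
    top = max(counts.values())
    return sorted(model for model, wins in counts.items() if wins == top)

# Winners for the DJ in the four cases, as described in the text:
# in-sample MAE, in-sample RMSE, out-of-sample MAE, out-of-sample RMSE.
dj_winners = [
    "GJR-GARCH-SGT-NN",
    "GJR-GARCH-SGT-NN",
    "GJR-GARCH-N-NN",
    "GJR-GARCH-N-NN",
]
print(best_models(dj_winners))  # ['GJR-GARCH-N-NN', 'GJR-GARCH-SGT-NN']
```

Each model wins two of the four cases for the DJ, so both are returned, which matches the two best models recorded for that index.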