Markov-Switching Bayesian Vector Autoregression Model in Mortality Forecasting
Abstract
1. Introduction
2. Model Formulation
2.1. Choosing a(x) and b(x) for the Age-Specific DLGC Model
2.2. MSBVAR
- the prime notation represents the matrix transpose
- p is the lag length
- is an n-dimensional column vector of endogenous variables at time t
- is an m-dimensional column vector of exogenous and deterministic variables at time t
- is an n-dimensional column vector of unobserved random shocks at time t
- is an invertible matrix and is an matrix for
- is an matrix for
- is an diagonal matrix for , and , conditional on information until time , is assumed to follow the multivariate normal distribution with mean 0 and variance , where is the identity matrix.
- is the overall tightness of the prior on the error covariance matrix. As it increases, the model moves away from a random walk.
- is the standard deviation or tightness of the prior around the AR(1) parameters.
- is the lag decay; as it increases, it shrinks the higher-order lag coefficients to 0.
- is the standard deviation or tightness around the intercept.
- is a single value for the standard deviation or tightness around the exogenous variable coefficients.
- is the weight on the sum-of-coefficients prior; larger values imply difference stationarity.
- are dummy initial observations or prior drift; larger values allow for common trends.
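To make the roles of the tightness and lag-decay hyperparameters concrete, the following minimal sketch computes the prior standard deviation of a lag-l coefficient, assuming the common Sims–Zha-style parameterization in which it equals the product of the overall tightness and the AR(1) tightness, divided by a scale factor times the lag raised to the lag-decay power. The names `lambda0`, `lambda1`, `lambda3`, and `sigma_j` are assumptions introduced for illustration (the symbols are not reproduced in the list above), and the numerical values are arbitrary.

```python
def prior_sd(lag, lambda0, lambda1, lambda3, sigma_j=1.0):
    """Prior standard deviation of a coefficient on the given lag.

    Assumed parameterization (illustrative, not taken from the text):
        sd = lambda0 * lambda1 / (sigma_j * lag ** lambda3)
    lambda0: overall tightness, lambda1: tightness around the AR(1) parameters,
    lambda3: lag decay, sigma_j: scale of the variable (hypothetical default 1).
    """
    if lag < 1:
        raise ValueError("lag must be a positive integer")
    return lambda0 * lambda1 / (sigma_j * lag ** lambda3)


# Arbitrary example values: a larger lag decay shrinks higher-order lags faster.
for lag in (1, 2, 3, 4):
    slow = prior_sd(lag, lambda0=0.8, lambda1=0.15, lambda3=1.0)
    fast = prior_sd(lag, lambda0=0.8, lambda1=0.15, lambda3=2.0)
    print(f"lag {lag}: decay 1.0 -> {slow:.4f}, decay 2.0 -> {fast:.4f}")
```

Under this assumed form, the prior standard deviation at lag 1 does not depend on the lag decay, while at higher lags a larger decay pulls the coefficients more strongly toward 0, which is the behavior described in the list above.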
3. Main Results
3.1. Structural Breaks and Regime-Switching in the Fitted Parameters
3.1.1. Mortality Structural Change during the 1990s
3.1.2. Structural Change in Mortality around the 1950s
3.2. Forecasting Strategy Used in This Paper
3.2.1. Constant Mean
- The values of and show high volatility and a noisy pattern; thus, a constant mean is a reasonable choice for forecasting them.
- The values of , k and z are stable near a constant for a number of years; for example, remains very close to 0.1 over several years, which is consistent with the findings of de Beer and Janssen (2016) and Thatcher (1999) that, from soon after age 30, the probability of dying increases by about 10% with each successive year of age.
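A constant-mean forecast of the kind described above can be sketched in a few lines. This is only an illustration: the function name `constant_mean_forecast` and the toy history are hypothetical, and the text does not state which historical window the authors average over.

```python
from statistics import fmean


def constant_mean_forecast(history, horizon):
    """Forecast every future year with the sample mean of the history."""
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    level = fmean(history)
    return [level] * horizon


# Hypothetical parameter history hovering around 0.1.
history = [0.101, 0.098, 0.102, 0.099, 0.100]
print(constant_mean_forecast(history, horizon=3))
```

The forecast is flat by construction, so it suits parameters that are either noisy around a level or stable near a constant, which are the two cases listed above.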
3.2.2. Two Groups of Transformed Parameters to Apply MSBVAR
- Group 1 (): All pairwise correlations are at least 0.831, with the highest value being 0.959; furthermore, no correlation between the parameters in Group 1 and the parameters outside this group exceeds 0.235 in absolute value, and only those with the single parameter are above 0.1.
- Group 2 (): Several of the correlations in this group have a magnitude of at least 0.349, with three being greater than 0.87, and most of the seven parameters are included in these correlations.
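The grouping criterion above (strong correlations inside a group, weak correlations across groups) can be checked mechanically. The sketch below uses only the entries for logA_diff, logB_diff, logC_diff, and logg_diff copied from the correlation table in this article; the helper names `within_range` and `max_cross` are hypothetical, and treating logg_diff as the single "outside" series is purely for illustration.

```python
from itertools import combinations

labels = ["logA_diff", "logB_diff", "logC_diff", "logg_diff"]
# Correlations copied from the article's correlation table (sub-block only).
corr = [
    [1.000, 0.846, 0.831, 0.003],
    [0.846, 1.000, 0.959, 0.003],
    [0.831, 0.959, 1.000, 0.017],
    [0.003, 0.003, 0.017, 1.000],
]


def within_range(corr, idx):
    """Smallest and largest absolute pairwise correlation inside a group."""
    vals = [abs(corr[i][j]) for i, j in combinations(idx, 2)]
    return min(vals), max(vals)


def max_cross(corr, idx, others):
    """Largest absolute correlation between a group and the other series."""
    return max(abs(corr[i][j]) for i in idx for j in others)


group1 = [0, 1, 2]                       # logA_diff, logB_diff, logC_diff
print(within_range(corr, group1))        # (0.831, 0.959)
print(max_cross(corr, group1, [3]))      # 0.017
```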
3.3. Forecasting Results
3.3.1. Out-of-Sample Forecasting Comparison Using Rolling-Window Cross-Validation
Model | LC_logPoi | VAR(1) | BVAR(1) | Two-group MSBVAR(1) |
---|---|---|---|---|
Total MSE of | 0.001909781 | 0.066175792 | 0.003349276 | 0.001550141 |
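The rolling-window design behind this comparison can be sketched as follows, assuming 46-year training windows followed by 3-year forecast periods over 1953–2020, as listed in the cross-validation table of this article. The function name `rolling_windows` is hypothetical, and the model fitting itself is not shown.

```python
def rolling_windows(first_year, last_year, train_len, horizon):
    """Enumerate (train_start, train_end, forecast_start, forecast_end) tuples.

    The training window slides forward one year at a time and must be
    followed by a full forecast period that ends no later than last_year.
    """
    windows = []
    start = first_year
    while start + train_len + horizon - 1 <= last_year:
        train_end = start + train_len - 1
        windows.append((start, train_end, train_end + 1, train_end + horizon))
        start += 1
    return windows


windows = rolling_windows(1953, 2020, train_len=46, horizon=3)
print(len(windows))   # 20
print(windows[0])     # (1953, 1998, 1999, 2001)
print(windows[-1])    # (1972, 2017, 2018, 2020)
```

Each tuple would then be used to fit a model on the training years and score its forecasts on the following three years.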
3.3.2. Future Mortality Forecasting Example
- For parameters showing a stable historical trend, such as , B, C, and (i.e., an almost linear trend), the MSBVAR model consistently provides much narrower 68% confidence intervals than the BVAR(1) model. Conversely, for parameters showing higher volatility in their historical values, such as , , , and , the MSBVAR model provides much wider 68% confidence intervals than BVAR(1). Thus, MSBVAR forecasts are more reasonable continuations of the historically observed trends and volatility than those of BVAR(1). A possible explanation for this advantage is that MSBVAR can recognize a period of time as being in a high-volatility or low-volatility regime and then use correspondingly more or less volatile distribution estimates for the VAR coefficients when forecasting in the respective regimes. This explanation is further analyzed and supported later in this section.
- As shown by the red/blue curve, i.e., the forecast value for each parameter, BVAR(1) tends to produce a nearly linear forecast curve that captures only a coarse general tendency, while our model can produce a more flexibly shaped forecast curve when necessary by accounting for the historical regime switches in the parameter. For example, our model predicts that will first drop and then increase over the following ten years, tracing a hook-shaped curve that reflects the historical ups and downs of . By contrast, BVAR(1) predicts only the coarse increasing tendency, as a line segment. Another example is ; BVAR(1) forecasts that it will continue the upward trend of the most recent historical data, while MSBVAR forecasts that it will decrease, beginning another of the regularly occurring downward trends visible in the historical plot of . The pink confidence intervals are more flexible in shape, often behaving asymmetrically between the upper and lower bounds. Both and experienced structural changes in the period 1953–2020 (see Section 3.1 for details).
- Based on the parameter interpretation in Table 1, BVAR(1)’s one-year through ten-year confidence intervals include several unreasonable prediction values for a number of parameters, while those of our model provide predictions that are within range. Examples include: for BVAR(1), the confidence intervals for A (infant mortality) and B (decline in mortality at age 1) cover values very close to 1; C (childhood mortality decline) and g (the old-age component of asymptotic mortality) have confidence intervals that include values of 1 and even greater; and the confidence interval for (the cap on mortality growth rate for old age) includes negative values. The MSBVAR model provides more acceptable forecast confidence intervals for all of these parameters.
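The link between regimes and interval width discussed in the points above can be illustrated with a toy simulation. The sketch below is a univariate two-regime Markov-switching AR(1) in which only the shock volatility switches; it is not the authors' MSBVAR, and every number in it (persistence 0.9, standard deviations 0.1 and 1.0, AR coefficient 0.5) is a hypothetical choice. The 68% band is taken as the 16% and 84% quantiles of the simulated paths, mirroring the quantiles reported in the forecast table.

```python
import random


def simulate_paths(start_regime, n_paths=5000, horizon=10, seed=0):
    """Simulate a toy two-regime Markov-switching AR(1) started at y = 0."""
    rng = random.Random(seed)
    stay = {0: 0.9, 1: 0.9}    # hypothetical probability of staying in a regime
    sd = {0: 0.1, 1: 1.0}      # regime 0: low volatility, regime 1: high volatility
    phi = 0.5                  # hypothetical AR(1) coefficient shared by both regimes
    paths = []
    for _ in range(n_paths):
        regime, y, path = start_regime, 0.0, []
        for _ in range(horizon):
            if rng.random() > stay[regime]:
                regime = 1 - regime          # switch regime
            y = phi * y + rng.gauss(0.0, sd[regime])
            path.append(y)
        paths.append(path)
    return paths


def band(paths, step, lo=0.16, hi=0.84):
    """16%-84% band at a forecast step, using order statistics (no interpolation)."""
    values = sorted(p[step] for p in paths)
    n = len(values)
    return values[int(lo * (n - 1))], values[int(hi * (n - 1))]


for start in (0, 1):
    low, high = band(simulate_paths(start), step=0)
    print(f"start regime {start}: 1-step 68% band width = {high - low:.3f}")
```

Starting from the high-volatility regime yields a visibly wider one-step band than starting from the low-volatility regime, which is the qualitative mechanism suggested above; a single-regime model would instead pool both periods into one intermediate volatility.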
4. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
1 | Strategy for predetermining hyperparameters in our use of MSBVAR: for the Group 1 parameters, we used the hyperparameter values: , , , , , , with the Sims–Zha normal–flat prior. For the Group 2 parameters, we split the 46-year training data, fit BVAR on the first 41 years, and forecast mortality for the last 5 years with different hyperparameter values. We estimated the posterior and the in-sample fit RMSE for each result and picked the set of hyperparameter values with the lowest RMSE, which was then used in the associated out-of-sample forecast (training on the 46-year data and predicting 1, 2, and 3 years ahead). Here, we set , , , and , used the Sims–Zha normal–flat prior, and tried several possible values of , , and . |
2 | Strategy for predetermining hyperparameters in the BVAR(1) model: we applied the same strategy described in the above footnote for the Group 2 parameters in our model. However, following Njenga and Sherris (2020), we set and , because the parameter time series were not transformed to be stationary and may have differed in their trends or stationarity. |
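The selection strategy in these footnotes (hold out the last 5 of 46 training years, score each candidate setting by RMSE, keep the best) can be sketched generically. In the sketch below, `drift_forecast` is a deliberately simple stand-in for fitting a BVAR, the single `shrink` value stands in for a full set of hyperparameters, and the series is synthetic; all of these are assumptions made for illustration only.

```python
import math


def drift_forecast(train, horizon, shrink):
    """Toy stand-in for a fitted model: random walk plus a shrunken average drift."""
    drift = (train[-1] - train[0]) / (len(train) - 1)
    return [train[-1] + shrink * drift * h for h in range(1, horizon + 1)]


def rmse(actual, predicted):
    """Root mean squared error between two equally long sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def select_hyperparameter(series, candidates, holdout=5, forecaster=drift_forecast):
    """Fit on all but the last `holdout` points and keep the lowest-RMSE candidate."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {c: rmse(test, forecaster(train, holdout, c)) for c in candidates}
    best = min(scores, key=scores.get)
    return best, scores


# Synthetic 46-point series with a perfectly linear trend.
series = [2.0 * t for t in range(46)]
best, scores = select_hyperparameter(series, candidates=[0.0, 0.5, 1.0])
print(best, scores)
```

The chosen setting would then be reused when refitting on all 46 years to produce the out-of-sample forecasts.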
References
- Avraam, Demetris, Séverine Arnold, Dyfan Jones, and Bakhtier Vasiev. 2014. Time-evolution of age-dependent mortality patterns in mathematical model of heterogeneous human population. Experimental Gerontology 60: 18–30.
- Bardoutsos, Anastasios, Joop de Beer, and Fanny Janssen. 2018. Projecting delay and compression of mortality. Genus 74: 17.
- Brouhns, Natacha, Michel Denuit, and Jeroen K. Vermunt. 2002. A Poisson Log-Bilinear Regression Approach to the Construction of Projected Life Tables. Insurance: Mathematics and Economics 31: 373–93.
- de Beer, Joop, and Fanny Janssen. 2016. A new parametric model to assess delay and compression of mortality. Population Health Metrics 14: 1–21.
- Fu, Wanying, Barry R. Smith, Patrick Brewer, and Sean Droms. 2022. A New Mortality Framework to Identify Trends and Structural Changes in Mortality Improvement and Its Application in Forecasting. Risks 10: 161.
- Gao, Huan, Rogemar Mamon, Xiaoming Liu, and Anton Tenyakov. 2015. Mortality modelling with regime-switching for the valuation of a guaranteed annuity option. Insurance: Mathematics and Economics 63: 108–20.
- Gylys, Rokas, and Jonas Šiaulys. 2020. Estimation of Uncertainty in Mortality Projections Using State-Space Lee–Carter Model. Mathematics 8: 1053.
- Hainaut, Donatien. 2012. Multidimensional Lee-Carter model with switching mortality processes. Insurance: Mathematics and Economics 50: 236–46.
- Heligman, Larry, and John Pollard. 1980. The age pattern of mortality. Journal of the Institute of Actuaries 107: 49–80.
- Human Mortality Database. n.d. University of California, Berkeley (USA), and Max Planck Institute for Demographic Research (Germany). Available online: www.mortality.org (accessed on 25 August 2022).
- Ignatieva, Katja, Andrew Song, and Jonathan Ziveyi. 2016. Pricing and hedging of guaranteed minimum benefits under regime-switching and stochastic mortality. Insurance: Mathematics and Economics 70: 286–300.
- Krolzig, Hans-Martin. 2000. Predicting Markov-Switching Vector Autoregressive Processes. Nuffield College, University of Oxford, Economics Discussion Papers 2000-W31. Oxford: University of Oxford.
- Lee, Ronald D., and Lawrence R. Carter. 1992. Modeling and Forecasting U.S. Mortality. Journal of the American Statistical Association 87: 659–71.
- Lin, X. Sheldon, and Xiaoming Liu. 2007. Markov Aging Process and Phase-Type Law of Mortality. North American Actuarial Journal 11: 92–109.
- Litterman, Robert B. 1986. Forecasting with Bayesian Vector Autoregressions: Five Years of Experience. Journal of Business & Economic Statistics 4: 25–38.
- Lu, Yang, and Dan Zhu. 2023. Modelling Mortality: A Bayesian factor-augmented VAR (FAVAR) approach. ASTIN Bulletin 53: 29–61.
- McNown, Robert, and Andrei Rogers. 1989. Forecasting mortality: A parameterized time series approach. Demography 26: 645–60.
- Milidonis, Andreas, Yijia Lin, and Samuel H. Cox. 2011. Mortality Regimes and Pricing. North American Actuarial Journal 15: 266–89.
- Njenga, Carolyn Ndigwako, and Michael Sherris. 2020. Modeling mortality with a Bayesian vector autoregression. Insurance: Mathematics and Economics 94: 40–57.
- Robertson, John, and Ellis Tallman. 1999. Vector autoregressions: Forecasting and reality. Economic Review 84: 4–18.
- Shen, Yang, and Tak Siu. 2013. Longevity bond pricing under stochastic interest rate and mortality with regime-switching. Insurance: Mathematics and Economics 52: 114–23.
- Sims, Christopher A., Daniel F. Waggoner, and Tao Zha. 2008. Methods for inference in large multiple-equation Markov-switching models. Journal of Econometrics 146: 255–74.
- Sims, Christopher A., and Tao Zha. 1998. Bayesian Methods for Dynamic Multivariate Models. International Economic Review 39: 949–68.
- Thatcher, A. Roger. 1999. The long-term pattern of adult mortality and the highest attained age. Journal of the Royal Statistical Society: Series A (Statistics in Society) 162: 5–43.
- Thompson, Patrick A., William R. Bell, John F. Long, and Robert B. Miller. 1989. Multivariate time series projections of parameterized age-specific fertility rates. Journal of the American Statistical Association 84: 689–99.
- Zhou, Rui. 2019. Modelling mortality dependence with regime-switching copulas. ASTIN Bulletin: The Journal of the IAA 49: 373–407.
| | Pars | Meaning | Range | Other Constraints |
---|---|---|---|---|
Youngest age | A | the level of child mortality, approximate to | (0, 1) | |
| | B | the mortality probabilities’ difference between age 0 and 1, i.e., | (0, 1) | |
| | C | the decline in mortality during childhood | (0, 1) | |
Teenage years | k | the growth rate of death probability only caused by teenage years factors | | |
| | z | the “accident hump” age | | |
| | | the maximum death probability only caused by teenage years factors | (0, 1) | |
Later adulthood | | the main increasing process of happens near | | |
| | g | the maximum death probability only caused by later adulthood factors | (0, 1) | |
| | b(x) | growth rate of considering only natural and basic factors | (0, 1) | continuously increasing |
| | a(x) | a controlling factor on the increasing of death probability | | continuous; satisfying , for x after a certain old age |
Coefficients | Transition Probabilities |
---|---|
| | logA_diff | logB_diff | logC_diff | log_diff2 | log_diff | log_diff2 | log_diff | log_diff | log_diff2 | logg_diff |
---|---|---|---|---|---|---|---|---|---|---|
logA_diff | 1.000 | 0.846 | 0.831 | 0.016 | 0.040 | −0.047 | 0.014 | 0.111 | 0.019 | 0.003 |
logB_diff | 0.846 | 1.000 | 0.959 | −0.002 | 0.065 | 0.018 | −0.007 | 0.216 | −0.004 | 0.003 |
logC_diff | 0.831 | 0.959 | 1.000 | 0.038 | 0.089 | 0.058 | −0.025 | 0.235 | −0.006 | 0.017 |
log_diff2 | 0.016 | −0.002 | 0.038 | 1.000 | 0.222 | −0.172 | −0.123 | 0.289 | 0.020 | −0.022 |
log_diff | 0.040 | 0.065 | 0.089 | 0.222 | 1.000 | 0.283 | −0.934 | 0.225 | 0.099 | 0.872 |
log_diff2 | −0.047 | 0.018 | 0.058 | −0.172 | 0.283 | 1.000 | −0.349 | 0.274 | −0.547 | 0.405 |
log_diff | 0.014 | −0.007 | −0.025 | −0.123 | −0.934 | −0.349 | 1.000 | −0.230 | −0.104 | −0.927 |
log_diff | 0.111 | 0.216 | 0.235 | 0.289 | 0.225 | 0.274 | −0.230 | 1.000 | −0.174 | 0.118 |
log_diff2 | 0.019 | −0.004 | −0.006 | 0.020 | 0.099 | −0.547 | −0.104 | −0.174 | 1.000 | 0.083 |
logg_diff | 0.003 | 0.003 | 0.017 | −0.022 | 0.872 | 0.405 | −0.927 | 0.118 | 0.083 | 1.000 |
Training Period | Forecast Period | LC_logPoi | VAR(1) | BVAR(1) | Two-Group MSBVAR(1) |
---|---|---|---|---|---|
1953–1998 | 1999–2001 | 4.058080 | 1.846671 | 8.634617 | 1.652785 |
1954–1999 | 2000–2002 | 3.096344 | 3.636803 | 5.286439 | 1.619272 |
1955–2000 | 2001–2003 | 3.998948 | 2.027850 | 2.199262 | 1.654713 |
1956–2001 | 2002–2004 | 3.965374 | 1.105280 | 1.361299 | 1.981227 |
1957–2002 | 2003–2005 | 4.237491 | 4.750569 | 6.453116 | 4.875723 |
1958–2003 | 2004–2006 | 5.315747 | 1.373889 | 1.389112 | 5.079026 |
1959–2004 | 2005–2007 | 3.805303 | 2.024754 | 4.464989 | 4.506862 |
1960–2005 | 2006–2008 | 3.819256 | 6.901466 | 3.070929 | 2.972596 |
1961–2006 | 2007–2009 | 2.166763 | 2.789243 | 3.493766 | 2.536762 |
1962–2007 | 2008–2010 | 2.578418 | 1.371940 | 3.575074 | 1.345337 |
1963–2008 | 2009–2011 | 3.077442 | 8.987640 | 1.806542 | 3.855534 |
1964–2009 | 2010–2012 | 3.788919 | 1.593804 | 2.633839 | 8.409379 |
1965–2010 | 2011–2013 | 3.361986 | 4.207817 | 1.243555 | 1.548707 |
1966–2011 | 2012–2014 | 2.619416 | 2.614010 | 4.466430 | 1.595633 |
1967–2012 | 2013–2015 | 2.357353 | 3.341852 | 4.016176 | 1.841944 |
1968–2013 | 2014–2016 | 2.106681 | 1.134660 | 3.815115 | 1.343410 |
1969–2014 | 2015–2017 | 1.018989 | 7.637350 | 5.350254 | 9.641061 |
1970–2015 | 2016–2018 | 9.301274 | 5.500549 | 3.125023 | 4.138879 |
1971–2016 | 2017–2019 | 2.765426 | 7.897331 | 1.665924 | 2.939270 |
1972–2017 | 2018–2020 | 4.591307 | 2.929639 | 2.578495 | 1.817020 |
Training Period | Forecast Period | LC_logPoi | Two-Group MSBVAR(1) |
---|---|---|---|
1962–2007 | 2008–2012 | 1.122421 | 2.328074 |
1963–2008 | 2009–2013 | 1.263967 | 1.305586 |
1964–2009 | 2010–2014 | 1.207380 | 8.582214 |
1965–2010 | 2011–2015 | 1.001330 | 2.192817 |
1966–2011 | 2012–2016 | 1.409463 | 8.506634 |
1967–2012 | 2013–2017 | 1.129325 | 2.735066 |
1968–2013 | 2014–2018 | 5.705891 | 1.345621 |
1969–2014 | 2015–2019 | 1.030199 | 1.351506 |
1970–2015 | 2016–2020 | 1.015846 | 6.914451 |
Sum of the MSEs of for all 45 forecasts | 2.925156 | 1.985076 |
Full Mean Transition Matrix for Group 1 Parameters | Full Mean Transition Matrix for Group 2 Parameters |
---|---|
| | BVAR(1) | | | MSBVAR(1) (The Mean of the 20,000 Draws from Posterior Forecast Density) | | | | | |
---|---|---|---|---|---|---|---|---|---|
| | | | | The 1st Regime | | | The 2nd Regime | | |
Intercepts | −2.05862 | −4.28405 | −0.85628 | −0.46341 | −0.58897 | −0.05221 | −1.52664 | −4.69968 | −0.77721 |
AR coefficients | −0.31006 | −2.00789 | −0.25976 | 0.29793 | −1.17564 | −0.03485 | −0.48458 | −2.86573 | −0.37377 |
| | −0.10818 | −0.05727 | −0.04763 | −0.14130 | 0.13122 | −0.06355 | −0.15121 | 0.18683 | −0.10257 |
| | 0.62313 | 0.08462 | 0.14539 | 0.56322 | −2.27056 | 0.14019 | 0.87628 | 0.04168 | 0.48325 |
Posterior residual covariance | 120.56879 | 448.4988 | 76.79942 | 51.88254 | 282.05098 | 35.82726 | 180.67741 | 594.69937 | 113.10181 |
| | 448.49879 | 2552.0473 | 417.17693 | 282.05098 | 2453.57520 | 318.34130 | 594.69937 | 2558.24889 | 491.59018 |
| | 76.79942 | 417.1769 | 77.05636 | 35.82726 | 318.34130 | 49.05973 | 113.10181 | 491.59018 | 105.75571 |
Model | Time Series (Scaled by 100) | logA_diff | | | logB_diff | | | logC_diff | | |
---|---|---|---|---|---|---|---|---|---|---|
No. Years Ahead Forecasted | | 1-Year | 2-Year | 3-Year | 1-Year | 2-Year | 3-Year | 1-Year | 2-Year | 3-Year |
BVAR(1) | 16% quantile | −24.40069 | −50.68266 | −43.10678 | 72.96510 | −128.42168 | −82.78513 | 13.93015 | −19.57496 | −11.80056 |
| | mean | 15.31044 | −7.63767 | 0.23389 | 107.23746 | −39.49984 | 12.75223 | 16.97254 | −7.49284 | 1.91958 |
| | 84% quantile | 54.81178 | 35.26181 | 42.70323 | 140.9184 | 49.12812 | 107.46391 | 19.97902 | 4.69020 | 15.67848 |
MSBVAR(1) | 16% quantile | 3.761618 | −23.17480 | −11.93503 | 45.70565 | −114.48584 | −48.72999 | 8.20572 | −20.95895 | −8.49605 |
| | mean | 20.57650 | −8.34501 | 2.44477 | 111.76375 | −47.43908 | 16.82710 | 20.92243 | −8.66176 | 3.25535 |
| | 84% quantile | 37.15495 | 6.60100 | 16.32605 | 176.92057 | 20.27684 | 80.27992 | 33.62026 | 3.69373 | 14.73931 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Fu, W.; Smith, B.R.; Brewer, P.; Droms, S. Markov-Switching Bayesian Vector Autoregression Model in Mortality Forecasting. Risks 2023, 11, 152. https://doi.org/10.3390/risks11090152