1. Introduction
Default forecasting is crucial for financial institutions and investors. Prior to investing in or extending credit to a company, investors and creditors must assess the company’s financial distress risk in order to avoid incurring a significant loss. In the financial distress literature, default risk modelling can be grouped into two main categories: structural and reduced-form approaches. This paper uses the reduced-form method of correlated default timing. Interested readers may refer to Nguyen and Zhou (2023) for an overview of the literature on reduced-form models of correlated default timing.
Accounting-based measures are the first generation of reduced-form models for predicting the failure of a company. The earliest works predicting this type of financial distress are univariate analyses (Beaver 1966, 1968), which employ financial ratios independently and adopt a cut-off point for each financial ratio in order to improve the precision of classifications for a distinct sample.
Altman (1968) conducted a multivariate analysis of business failure based on multiple discriminant analysis, combining numerous financial ratios from the financial statements into a single weighted index. The second generation of the default literature is the logistic model (Ohlson 1980), which was developed to address the shortcomings of the Altman Z-score method.
Shumway (2001) attempts to predict defaults and shows that half of the accounting ratios used by Altman (1968) and Zmijewski (1984) are poor predictors in default models, while a large number of market-driven independent variables are significantly associated with default probability. The recent expansion of reduced-form default risk models has centred on duration analysis.
Jarrow and Turnbull (1995) and Jarrow et al. (1997) are the pioneers of term structure and credit spread modelling.
With regard to duration analysis, recent research indicates that observable macroeconomic and firm-specific factors may not be sufficient to characterize the variation in default risk, as corporate default rates are strongly correlated with latent factors. The need for and importance of the hidden factor in a default model are discussed in several recent studies, such as
Koopman and Lucas (2008), Duffie et al. (2009), Chava et al. (2011), Koopman et al. (2011, 2012), Creal et al. (2014), Azizpour et al. (2018), and Nguyen (2023).
To improve the prediction accuracy of default models, the use of expert judgement in the decision-making process is common in practice, as there may not be enough statistically significant empirical evidence to reliably estimate the parameters of complicated models. This problem is of central interest in the empirical literature on Bayesian inference, where it has stimulated a number of debates. In the process of inference, however, the majority of Bayesian analyses utilize non-informative priors formed by formal principles. The theoretical foundation adopted by the majority of Bayesians is that of Savage (1971, 1972) and De Finetti (2017). Although non-informative prior distributions play a crucial role in defining the model for certain problems, they have an unavoidable drawback: it is sometimes impossible to specify only non-informative priors and disregard informative ones. Bayes factors are sensitive to the choice of the unknown parameters of informative prior distributions, which can strongly influence the posterior distribution; as a consequence, the selection of priors remains a subject of debate. Moreover, real prior information is beneficial for specific applications, whereas non-informative priors do not take advantage of it; such circumstances therefore require informative priors. In other words, this is where subjective views and expert opinion are combined. Given a complex, high-dimensional posterior distribution, it is uncertain whether it has been exhaustively summarized, a task best left to an experienced statistician. Choosing informative priors and establishing a connection with expert opinion are still debated in academic research, and the discussion is ongoing. Recently, there has been research on default prediction combined with expert opinion using machine learning techniques, such as
Lin and McClean (2001), Kim and Han (2003), Zhou et al. (2015), and Gepp et al. (2018). However, these studies adopt machine learning techniques with single classifiers.
Motivated by these findings, this paper aims to answer the research question of whether adding expert opinion to the frailty-correlated default risk model can yield better prediction results. To do so, we incorporate prior distributions into the frailty-correlated default model of Duffie et al. (2009) and adopt the Particle MCMC approach of Nguyen (2023) to estimate the unknown parameters and predict default risk, using a dataset of U.S. public non-financial firms spanning 1980–2019. Our findings show that the 1-year prediction for frailty-correlated default models with different prior distributions is relatively good, whereas the prediction accuracy of the models decreases significantly as the prediction horizon increases. The results also indicate that the prediction accuracy ratios for frailty-correlated default models with non-informative and subjective prior distributions over various prediction horizons are not significantly different. Specifically, the out-of-sample prediction accuracy for the frailty-correlated default model with subjective prior distribution is slightly higher than that of the model with uniform prior distribution (95.00% for 1-year, 85.23% for 2-year, and 83.18% for 3-year predictions under the uniform prior; and 96.05% for 1-year, 86.32% for 2-year, and 84.71% for 3-year predictions under the subjective prior).
To achieve the research objective, the remainder of the paper is organized as follows:
Section 2 presents the econometric model and the estimation methodology for the frailty-correlated default models with the different prior distributions.
Section 3 reports major results. Data and the choice of covariates are also presented in this section.
Section 4 provides the model performance evaluation.
Section 5 presents the concluding remarks and limitations of the research.
2. Econometric Model
This section outlines the econometric model used by Duffie et al. (2009), our extension to the model, and our improvement of the method to examine and forecast default risk at the firm level. We first introduce the notation used in this study. We consider a complete filtered probability space $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \geq 0}, \mathbb{P})$, where the filtration $\{\mathcal{F}_t\}_{t \geq 0}$ describes the flow of information over time and $\mathbb{P}$ is a real-world probability measure. Further on, we use the standard convention where capital letters denote random variables, whereas lower-case letters are used for their values.
The complete Markov state vector is described as $W = (U, Y, V, H)$, where we let $W$ be a Markov state vector of firm-specific and macroeconomic covariates, $U_{it}$ be a vector of observable firm-specific covariates for firm $i$ from its first observation time $t_i$ until its last observation time $T_i$, $Y_{it}$ be an unobservable firm-specific covariate, $V_t$ be a vector of observable macroeconomic variables at all times, and $H_t$ be an unobservable frailty (latent macroeconomic factor) variable; let $Z_{it} = (1, U_{it}, V_t)$ denote a vector of observable covariates for firm $i$ at time $t$, where 1 is a constant.
On the event $\{\tau_i > t\}$ of survival to $t$, given the information set $\mathcal{F}_t$, the conditional probability of survival to time $t + s$ is
$$p_i(t, s) = \mathbb{P}(\tau_i > t + s \mid \mathcal{F}_t) = E\left[ e^{-\int_t^{t+s} \lambda_i(u)\, du} \,\middle|\, \mathcal{F}_t \right], \qquad (1)$$
and the conditional default probability at time $t + s$ is of the form:
$$q_i(t, s) = 1 - p_i(t, s). \qquad (2)$$
The information filtration $\{\mathcal{F}_t\}$ includes the information set of the observed macroeconomic/firm-specific variables:
$$\mathcal{F}_t = \sigma\left( \{ Z_{is} : s \leq t,\ i = 1, \ldots, m \} \right).$$
The complete information filtration $\{\mathcal{G}_t\}$ contains the variables in the information filtration $\{\mathcal{F}_t\}$ and the frailty process $H$:
$$\mathcal{G}_t = \mathcal{F}_t \vee \sigma\left( \{ H_s : s \leq t \} \right).$$
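To illustrate how the conditional probabilities in Equations (1) and (2) can be evaluated in practice, the following is a minimal Monte Carlo sketch, not the paper's implementation: the function name, array shapes, and the constant-intensity example values are hypothetical.

```python
import numpy as np

def survival_and_default_prob(lam_paths, dt):
    """Monte Carlo approximation of the conditional survival probability
    p(t, s) = E[exp(-integral_t^{t+s} lambda_i(u) du) | F_t] and the
    conditional default probability q(t, s) = 1 - p(t, s).

    lam_paths : (n_sims, n_steps) simulated intensity paths over (t, t+s]
    dt        : length of each discretization step
    """
    integrals = lam_paths.sum(axis=1) * dt   # left-point rule for the integral
    p = np.exp(-integrals).mean()            # average over simulated paths
    return p, 1.0 - p

# Hypothetical example: constant annual intensity of 2% over a 1-year horizon,
# discretized into 12 monthly steps; p is then approximately exp(-0.02)
p, q = survival_and_default_prob(np.full((10_000, 12), 0.02), dt=1.0 / 12)
```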
The assumptions are imposed as follows:
Assumption 1. All firms’ default intensities at time t depend on a Markov state vector which is only partially observable.
Assumption 2. Conditional on the path of the Markov state process W determining the default intensities, the firm default times are the first event times of an independent Poisson process with time-varying intensities determined by the path of W. This is referred to as a doubly stochastic assumption.
Assumption 3. Setting the level of mean-reversion of H to zero, the unobserved frailty process H is a mean-reverting Ornstein–Uhlenbeck (OU) process which is given by the stochastic differential equation below:
$$dH_t = -\eta H_t\, dt + \sigma\, dB_t,$$
where $\eta$ and $\sigma$ are parameters; $B$ is a standard Brownian motion with respect to $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \geq 0}, \mathbb{P})$; $\eta$ is a nonnegative constant, the speed of mean-reversion of H; and $\sigma$ is the volatility of the Brownian motion. In the general case, without Assumption 3, we would need extremely numerically intensive Monte Carlo integration in a high-dimensional space due to our large dataset from 1980 to 2019. Thus, we assume process H is an OU process, as in Duffie et al. (2009).
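As a concrete illustration of Assumption 3, the sketch below simulates a frailty path using the exact Gaussian transition of the OU process; the parameter values (eta = 0.25, sigma = 1) and the function name are hypothetical and not estimates from this paper.

```python
import numpy as np

def simulate_ou_frailty(eta=0.25, sigma=1.0, T=480, dt=1.0, h0=0.0, seed=0):
    """Simulate the zero-mean OU frailty dH = -eta*H dt + sigma dB using the
    exact transition H_{t+dt} | H_t ~ N(H_t * exp(-eta*dt), conditional var)."""
    rng = np.random.default_rng(seed)
    h = np.empty(T)
    h[0] = h0
    a = np.exp(-eta * dt)                        # autoregressive coefficient
    s = sigma * np.sqrt((1 - a**2) / (2 * eta))  # conditional std over one step
    for t in range(1, T):
        h[t] = a * h[t - 1] + s * rng.standard_normal()
    return h

h_path = simulate_ou_frailty()   # e.g., 480 monthly observations
```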
The default intensity of a firm $i$ at time $t$ is:
$$\lambda_{it} = \exp\left(\beta \cdot Z_{it} + \gamma H_t\right),$$
where $Z_{it}$ is the observable component of the state vector at time $t$ and $\kappa = (\beta, \gamma, \eta, \sigma)$ is a parameter vector to be estimated; $\beta$ is a parameter vector of the observable covariates $Z$; $\gamma$ is a parameter of the frailty variable $H$; $\eta$ is the speed of mean-reversion of $H$; and $\sigma$ is a Brownian motion parameter of $H$. The parameters $\eta$ and $\sigma$ need to be estimated through the mean-reverting OU process which we assume the unobserved frailty process $H$ follows. The proportional hazards form is expressed by
$$\lambda(t \mid Z_{it}, H_t) = \exp\left(\beta \cdot Z_{it} + \gamma H_t\right).$$
$D_t = (D_{1t}, \ldots, D_{mt})$ is the vector of default indicators of the $m$ firms. The default indicator $D_{it}$ of firm $i$ at time $t$ is defined as:
$$D_{it} = \begin{cases} 1 & \text{if firm } i \text{ defaults at time } t, \\ 0 & \text{otherwise.} \end{cases}$$
Now we consider the conditional probability of the $m$ companies' default indicators. As mentioned above, we let $t_i$ be the first observation time for firm $i$ and $T_i$ be the last observation time for firm $i$. For each firm $i$ and fixed time $t$, under the doubly stochastic assumption, the one-period default probability is $1 - e^{-\lambda_{it}\Delta t}$, and then, in our case, the conditional probability for the individual firm is given by
$$L_i(\kappa \mid Z, H, D_i) = \prod_{t = t_i}^{T_i} \left[ e^{-\lambda_{it}\Delta t} (1 - D_{it}) + \left(1 - e^{-\lambda_{it}\Delta t}\right) D_{it} \right].$$
Thus, the conditional probability of the $m$ firms is expressed as:
$$L(\kappa \mid Z, H, D) = \prod_{i=1}^{m} L_i(\kappa \mid Z, H, D_i).$$
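For readers who prefer code, here is a minimal sketch of this conditional (complete-data) likelihood in log form on a discrete time grid. The array layout, the NaN convention for firm-months outside $[t_i, T_i]$, and the function name are our own illustration rather than the paper's implementation.

```python
import numpy as np

def complete_data_loglik(beta, gamma, Z, H, D, dt=1.0):
    """Log of the conditional probability of the m firms' default indicators
    given observable covariates and one frailty path (doubly stochastic).

    Z : (m, T, p) observable covariates Z_it (first entry a constant)
    H : (T,)      frailty path H_t
    D : (m, T)    default indicators D_it; NaN outside [t_i, T_i]
    """
    lam = np.exp(Z @ beta + gamma * H)       # intensities lambda_it, shape (m, T)
    p_def = 1.0 - np.exp(-lam * dt)          # one-period default probability
    ll = np.where(D == 1, np.log(p_def), -lam * dt)  # default vs. survival terms
    return np.nansum(np.where(np.isnan(D), np.nan, ll))  # skip unobserved months
```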
Applying Bayes’ theorem:
$$p(\kappa \mid Z, D) \propto L(\kappa \mid Z, D)\, p(\kappa).$$
We have two cases for the prior distribution $p(\kappa)$: (i) uniform prior and (ii) subjective prior.
Prior distribution is uniform:
$$p(\kappa) \propto 1,$$
where $p(\kappa)$ is constant over the parameter space (non-informative prior distribution). This case is exactly the one studied by Duffie et al. (2009). Our extension to the model by combining it with priors is given below.
Prior distribution is subjective:
$$p(\kappa) = \mathcal{N}(\mu_0, \Sigma_0),$$
where $\mathcal{N}(\mu_0, \Sigma_0)$ is the multivariate normal prior with a mean vector $\mu_0$ and a covariance matrix $\Sigma_0$.
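The two prior cases translate directly into code: the log posterior is the log-likelihood plus a log-prior term that is constant in the uniform case and a multivariate normal log density in the subjective case. A minimal sketch, with hypothetical prior values:

```python
import numpy as np
from scipy.stats import multivariate_normal

def log_posterior(kappa, loglik_fn, mu0=None, Sigma0=None):
    """Log posterior up to an additive constant. If mu0/Sigma0 are omitted,
    the prior is uniform and the posterior is proportional to the likelihood;
    otherwise a multivariate normal prior N(mu0, Sigma0) encodes expert opinion."""
    lp = loglik_fn(kappa)
    if mu0 is not None:
        lp += multivariate_normal.logpdf(kappa, mean=mu0, cov=Sigma0)
    return lp

# Hypothetical expert opinion for a 3-dimensional kappa
mu0 = np.array([0.0, -1.0, 0.5])      # assumed expert mean vector
Sigma0 = np.diag([1.0, 1.0, 0.25])    # assumed expert covariance matrix
lp = log_posterior(np.zeros(3), lambda k: -0.5 * k @ k, mu0, Sigma0)
```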
If the observable covariate process $Z$ is independent of the frailty process $H$, the likelihood function of the intensity parameter vector $\kappa$ is given by
$$L(\kappa \mid Z, D) = \int L(\kappa \mid Z, h, D)\, p_H(h)\, dh,$$
where $p_H(\cdot)$ is the unconditional probability density of the unobservable frailty process $H$.
Now we show how to transform the model with frailty-correlated defaults into one combined with the subjective prior distribution. We found the posterior probability density earlier as
$$p(\kappa \mid Z, D) \propto L(\kappa \mid Z, D)\, p(\kappa). \qquad (11)$$
Taking the logarithm of Equation (11),
$$\log p(\kappa \mid Z, D) = \log L(\kappa \mid Z, D) + \log p(\kappa) + \text{const}. \qquad (12)$$
Recall that the log-likelihood of parameter value $\kappa$ given the observable and hidden variables is given by
$$\log L(\kappa \mid Z, D) = \log \int L(\kappa \mid Z, h, D)\, p_H(h)\, dh.$$
We proceed to take the logarithm of the second term of Equation (12):
$$\log p(\kappa) = -\frac{k}{2}\log(2\pi) - \frac{1}{2}\log|\Sigma_0| - \frac{1}{2}(\kappa - \mu_0)^{\top} \Sigma_0^{-1} (\kappa - \mu_0),$$
where $k$ is the dimension of $\kappa$. In the second term, the central interest is the covariance matrix. For notational simplicity, set $\Lambda = \Sigma_0^{-1}$. It is then rewritten as
$$\log p(\kappa) = \frac{1}{2}\log|\Lambda| - \frac{k}{2}\log(2\pi) - \frac{1}{2}(\kappa - \mu_0)^{\top} \Lambda (\kappa - \mu_0).$$
Then, the second term can be rewritten as
$$-\frac{1}{2}(\kappa - \mu_0)^{\top} \Lambda (\kappa - \mu_0) = -\frac{1}{2}\left( \kappa^{\top} \Lambda \kappa - 2 \kappa^{\top} \Lambda \mu_0 + \mu_0^{\top} \Lambda \mu_0 \right).$$
We combine the terms of Equation (11) to obtain an overall likelihood function given the filtration $\mathcal{F}_T$:
$$\log p(\kappa \mid Z, D) = \log \int L(\kappa \mid Z, h, D)\, p_H(h)\, dh + \frac{1}{2}\log|\Lambda| - \frac{1}{2}(\kappa - \mu_0)^{\top} \Lambda (\kappa - \mu_0) + \text{const}. \qquad (17)$$
Now the central interest is to estimate Equation (17). We use a Bayesian approach coupled with the Particle MCMC algorithm to estimate and forecast the frailty-correlated default models with uniform and subjective prior distributions. Particle filters can be understood as sequential Monte Carlo (SMC) methods, as introduced by Handschin and Mayne (1969) and Handschin (1970). Particles are a set of points in the sample space, and particle filters provide approximations to the posterior densities via these points. Each particle has an assigned weight, and the posterior distribution can then be approximated by a discrete distribution. Several particle filter algorithms have been proposed in the literature, and they differ mainly in how the set of particles evolves and adapts to the input data. Algorithm 1 presents the sequential Monte Carlo process we apply in our method.
Algorithm 1: Sequential Monte Carlo algorithm
At time $t = 1$:
(1) Sample particles $X_1^{(k)} \sim q_1(\cdot)$, $k = 1, \ldots, N$.
(2) Calculate and normalize the weights $w_1^{(k)}$.
At time $t \geq 2$:
(1) Resample the particles, i.e., sample the ancestor indices $A_{t-1}^{(k)}$ according to the normalized weights.
(2) Sample $X_t^{(k)} \sim q_t(\cdot \mid X_{t-1}^{A_{t-1}^{(k)}})$ and set $X_{1:t}^{(k)} = (X_{1:t-1}^{A_{t-1}^{(k)}}, X_t^{(k)})$.
(3) Calculate and normalize the weights $w_t^{(k)}$.
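A compact bootstrap implementation of Algorithm 1 for the OU frailty state might look as follows. The observation-density callback `y_loglik`, the multinomial resampling at every step, and the function name are simplifying assumptions on our part; in our setting, `y_loglik(t, h)` would evaluate the period-t default-indicator likelihood at each candidate frailty value.

```python
import numpy as np

def smc_filter(y_loglik, eta, sigma, T, N=1000, dt=1.0, seed=0):
    """Bootstrap SMC (Algorithm 1) for the OU frailty state. Returns particle
    paths, final normalized weights, and an unbiased estimate of the log
    marginal likelihood (used later by PIMH)."""
    rng = np.random.default_rng(seed)
    a = np.exp(-eta * dt)                        # OU transition coefficient
    s = sigma * np.sqrt((1 - a**2) / (2 * eta))  # OU transition std
    h = np.zeros((T, N))
    h[0] = rng.normal(0.0, sigma / np.sqrt(2 * eta), N)  # stationary draw
    logw = y_loglik(0, h[0])                     # initial weights
    logZ = 0.0
    for t in range(1, T):
        m = logw.max()
        w = np.exp(logw - m)
        logZ += m + np.log(w.mean())             # accumulate likelihood estimate
        idx = rng.choice(N, size=N, p=w / w.sum())          # (1) resample
        h[:t] = h[:t, idx]                       # carry ancestral paths along
        h[t] = a * h[t - 1] + s * rng.standard_normal(N)    # (2) propagate
        logw = y_loglik(t, h[t])                 # (3) reweight
    m = logw.max()
    w = np.exp(logw - m)
    logZ += m + np.log(w.mean())
    return h, w / w.sum(), logZ
```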
One disadvantage of this approach is that the SMC approximation to the joint smoothing distribution deteriorates when the time horizon $T$ is too large. Andrieu et al. (2010) proposed the particle independent Metropolis–Hastings (PIMH) method to overcome this difficulty. This is a class of MCMC methods that uses the SMC algorithm as a component to design multi-dimensional proposal distributions. The advantage of this method is that the PIMH sampler does not require the SMC algorithm to generate samples that accurately approximate the whole target distribution, but only to select a single sample approximately distributed according to it (see Andrieu et al. 2010). Algorithm 2 presents the PIMH method applied in our model.
Algorithm 2: PIMH algorithm
Set $i = 0$. Sample by SMC Algorithm 1, draw $X_{1:T}(0)$ from the resulting particle approximation, and set the marginal likelihood estimate $\hat{Z}(0)$.
For $i \geq 1$:
(1) Sample by SMC Algorithm 1, draw a proposal $X_{1:T}^{*}$, and compute its marginal likelihood estimate $\hat{Z}^{*}$.
(2) Draw $U$ from the uniform distribution on $(0, 1)$. If $U \leq \hat{Z}^{*} / \hat{Z}(i-1)$, set $X_{1:T}(i) = X_{1:T}^{*}$ and $\hat{Z}(i) = \hat{Z}^{*}$; else set $X_{1:T}(i) = X_{1:T}(i-1)$ and $\hat{Z}(i) = \hat{Z}(i-1)$.
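The accept/reject step of Algorithm 2 compares marginal-likelihood estimates from independent SMC runs. A minimal sketch, assuming a `run_smc` callable built, for instance, from the SMC sketch above:

```python
import numpy as np

def pimh(run_smc, n_iter=500, seed=1):
    """Particle independent Metropolis-Hastings (Algorithm 2). run_smc(rng)
    runs SMC Algorithm 1 and returns (path, logZ): one trajectory drawn from
    the particle approximation and the log marginal-likelihood estimate."""
    rng = np.random.default_rng(seed)
    path, logZ = run_smc(rng)                  # i = 0: initialize the chain
    chain = [path]
    for _ in range(n_iter):
        prop_path, prop_logZ = run_smc(rng)    # (1) propose via a fresh SMC run
        if np.log(rng.uniform()) < prop_logZ - logZ:  # (2) accept w.p. min(1, Z*/Z)
            path, logZ = prop_path, prop_logZ
        chain.append(path)                     # on rejection, keep current state
    return chain
```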
In our method, we combine Particle MCMC with the maximum likelihood method to estimate the intensity parameter vector $\kappa$ for the frailty-correlated model. We present the implementation steps in Algorithm 3. See Nguyen (2023) for further discussion of the methods.
Algorithm 3: Particle MCMC Expectation-Maximization algorithm
Initialize: set $i := 0$ and $\kappa^{(0)} := \hat{\kappa}$, where $\hat{\kappa}$ is an estimate of $\kappa$ in the model without the hidden factors.
Loop:
Set $i := i + 1$.
Sample the frailty path from $p(H \mid Z, D, \kappa^{(i-1)})$ by PIMH Algorithm 2.
Employ the maximum likelihood method to estimate the parameters $\kappa^{(i)}$ from Equation (17) using the generated samples.
Exit when achieving reasonable numerical convergence of the likelihood.
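Algorithm 3 alternates a PIMH draw of the frailty path with a maximization of the Equation (17) objective. A skeleton of the loop, with the two problem-specific callbacks (`sample_frailty`, `maximize_obj`) left as assumptions:

```python
import numpy as np

def pmcmc_em(kappa0, sample_frailty, maximize_obj, max_iter=50, tol=1e-4):
    """Particle MCMC Expectation-Maximization loop (Algorithm 3).
    sample_frailty(kappa) draws frailty paths given kappa via PIMH Algorithm 2;
    maximize_obj(paths) maximizes the Equation (17) objective over kappa using
    the generated samples and returns (kappa_new, loglik)."""
    kappa = np.asarray(kappa0, dtype=float)   # start from the no-frailty estimate
    prev_ll = ll = -np.inf
    for _ in range(max_iter):
        paths = sample_frailty(kappa)         # E-step: PIMH draws of H
        kappa, ll = maximize_obj(paths)       # M-step: maximize Eq. (17)
        if abs(ll - prev_ll) < tol:           # reasonable numerical convergence
            break
        prev_ll = ll
    return kappa, ll
```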
4. Out-of-Sample Performance and Robustness Check
To evaluate the model performance, we use the cumulative accuracy profile (CAP) and the accuracy ratio (AR). The companies are divided into two equal groups: estimation and evaluation. We estimate the parameters on the estimation group and then evaluate the prediction accuracy on the evaluation group. The implementation steps are as follows: Firstly, we estimate the parameters in the frailty-correlated default model with subjective prior distribution using the historical default rates over the period from 1981 to 2011. Secondly, using the estimation results obtained from Step 1, we forecast the data for the period from 2012 to 2018 based on the time series model of the observable firm-specific/macroeconomic covariates. Thirdly, we forecast the data of the frailty variable for the period 2012–2018 using the PIMH Algorithm 2. Fourthly, using the estimates from Step 1 and the data obtained from Steps 2 and 3, we compute the default probability based on Equation (2). Lastly, we determine a CAP and its associated AR. The CAPs and ARs for the out-of-sample prediction horizons are displayed in Figure 1 and Figure 2.
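For completeness, here is a small sketch of how the AR can be computed from predicted default probabilities and realized defaults; it uses the standard equivalence between the AR and the Gini coefficient (AR = 2*AUC - 1), and the function name and toy inputs are ours.

```python
import numpy as np

def accuracy_ratio(pd_scores, defaults):
    """Accuracy ratio from the cumulative accuracy profile (CAP): the area
    between the model CAP and the random-model diagonal, divided by the same
    area for the perfect model."""
    order = np.argsort(-np.asarray(pd_scores))          # riskiest firms first
    d = np.asarray(defaults, dtype=float)[order]
    x = np.concatenate(([0.0], np.arange(1, d.size + 1) / d.size))
    cap = np.concatenate(([0.0], np.cumsum(d) / d.sum()))
    area_model = np.sum((cap[1:] + cap[:-1]) * np.diff(x)) / 2 - 0.5
    area_perfect = 0.5 * (1.0 - d.mean())               # perfect-model area
    return area_model / area_perfect

# Toy example: four firms, two defaults, correctly ranked -> AR = 1.0
ar = accuracy_ratio([0.9, 0.05, 0.4, 0.2], [1, 0, 1, 0])
```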
Table 5 reports the results of out-of-sample predictions of the frailty-correlated models with uniform and subjective prior distributions. Comparing the two default models, it can be seen that the prediction ratios of the frailty-correlated default model with subjective prior distribution are higher than those of the model with uniform prior distribution. The out-of-sample accuracy for the 1-year prediction is good on average: specifically, 95.00 percent for the frailty-correlated model with a non-informative prior distribution and 96.05 percent for the model with a subjective prior distribution. As the prediction horizon is extended, the AR of the models suffers a significant decline, falling to 85.23 percent at the 2-year horizon and 83.18 percent at the 3-year horizon for the frailty-correlated model with uniform prior distribution, and to 86.32 percent and 84.71 percent, respectively, for the model with subjective prior distribution. We also perform out-of-sample default predictions using the logistic regression method¹ to compare its accuracy with that of our proposed method in Table 6. The results show that our method has better prediction power than the logistic regression method.
Overall, two notable conclusions can be drawn from these estimation results: (i) the 1-year prediction for both models is good, and as the prediction horizon increases, the prediction accuracy of the models decreases significantly; (ii) there is not much difference between the prediction accuracy ratios of the frailty-correlated default models with non-informative and subjective prior distributions over the three out-of-sample prediction horizons, namely 2012–2018 for the 1-year default prediction, 2013–2018 for the 2-year default prediction, and 2014–2018 for the 3-year default prediction.
To check the robustness of the estimation results for the frailty-correlated default model with subjective prior distribution, we re-estimated the model on the subperiod from 1980 to 2011 as a sensitivity test. The outcomes correspond with the signs and magnitudes obtained for the entire sample. Moreover, the log-likelihood value of the frailty-correlated default model with subjective prior distribution (−2202.45) is larger than that of the model with non-informative prior distribution (−2379.61), which confirms that the frailty-correlated default model should incorporate expert opinion.
5. Concluding Remarks and Limitations
Risk assessment is part of the decision-making process in many disciplines, including finance. In the financial distress literature, credit risk evaluation entails assessing the hazard of potential future exposure or probable loss to lenders in the context of lending activities. The effective management of credit risk is a crucial aspect of risk management and essential to the long-term survival of any bank. The objective of credit risk management is to maximize the bank’s risk-adjusted return by keeping credit risk exposure within acceptable limits. The ability to accurately forecast a company’s financial distress is a major concern for many stakeholders, and this practical relevance has motivated numerous studies on predicting corporate financial distress. To improve the prediction accuracy of default models, the use of expert judgement in the decision-making process is common in practice, as there may not be enough statistically significant empirical evidence to reliably estimate the parameters of complicated models. This problem is of central interest in the empirical literature on Bayesian inference, where it has stimulated a number of debates.
This paper proposes a method to add expert judgement to the frailty-correlated default risk model of Duffie et al. (2009) by incorporating subjective prior distributions into the model. We then employ the Bayesian method coupled with the Particle MCMC approach of Nguyen (2023) to estimate the unknown parameters and predict default risk on a historical default dataset of 424,601 firm-month observations of 2432 U.S. industrial firms from January 1980 to June 2019. We compare the prediction results of the frailty-correlated default risk model under uniform and subjective prior distributions. The findings show that the 1-year prediction for both models is quite good and that the prediction accuracy of the models decreases considerably as the prediction horizon increases. The results also indicate that the prediction accuracy ratios for frailty-correlated default models with non-informative and subjective prior distributions over various prediction horizons are not significantly different. Specifically, the out-of-sample prediction accuracy of the frailty-correlated default model with subjective prior distribution is slightly higher than that of the model with uniform prior distribution over the three out-of-sample prediction horizons, namely 2012–2018 for the 1-year default prediction, 2013–2018 for the 2-year default prediction, and 2014–2018 for the 3-year default prediction.
The frailty-correlated default model with expert opinion has been designed to estimate and predict the default risk of corporations, and the model can be adapted to other contexts. However, it also has limitations. Firstly, one of the main limitations is that we cannot access input data for expert opinion; therefore, to a certain extent, our results depend on the values we assume for the priors, and the prediction accuracy can differ slightly accordingly. Bayes factors are sensitive to the choice of the unknown parameters of informative prior distributions, which can strongly influence the posterior distribution; as a consequence, the selection of priors remains a subject of debate. According to Kass and Raftery (1995), non-informative priors may also contribute to instability of the posterior estimates and to convergence problems of the sampling algorithm. Choosing informative priors and establishing a connection with expert opinion are still debated in academic research, and the discussion is ongoing. Therefore, future work should use actual expert opinion data, which may be feasible in the age of big data. Recently, there has been research on default prediction combined with expert opinion using machine learning techniques, such as Lin and McClean (2001), Kim and Han (2003), Zhou et al. (2015), and Gepp et al. (2018). However, these studies adopt machine learning techniques with single classifiers and observable variables. Future work can adopt a meta-learning framework to examine and predict defaults with expert opinion at the firm level.