1. Introduction
Ever since the advent of the first digital currency, Bitcoin (BTC), which was created under the anonymous identity Nakamoto (2008), cryptocurrencies have become increasingly popular among investors as they provide an alternative investment strategy to conventional financial assets. Unlike fiat money, which is usually issued by central authorities, cryptocurrencies typically originate from decentralized virtual networks that store and distribute assets digitally. Cryptocurrencies are highly sought after as they stem from a peer-to-peer structure that allows efficiency, security and profitability. Their growing acceptance among investors has cemented their legitimacy as a medium of exchange and a source of liquidity for businesses, financial institutions and the general public alike.
In this paper, we concentrate on investigating the returns of the top seven cryptocurrencies by market capitalization in USD as of August 8, 2020, according to coinmarketcap.com. The top seven cryptocurrencies are Bitcoin (BTC), Ethereum (ETH), Ripple (XRP), Tether (USDT), Bitcoin Cash (BCH), Bitcoin SV (BSV), and Litecoin (LTC). Not only do we aim to uncover the relationship between the cryptocurrency and the macroeconomic conditions, but we also explore the dynamic interactions between the cryptocurrency market and the financial market, foreign exchange (Forex) market, commodity market and precious metals market. This helps potential investors to gain more insights into the leading cryptocurrencies.
The relationship between the macroeconomic conditions, the conventional assets market, and the cryptocurrencies’ returns has been extensively investigated in the current literature. For example, Dyhrberg (2016a), Dyhrberg (2016b) and Wu et al. (2019) showed that gold prices help explain the variations in the cryptocurrencies. Moreover, Bouri et al. (2017), Demir et al. (2018), Panagiotidis et al. (2019) and Wu et al. (2019) provided evidence that economic policy uncertainty plays a role in explaining the cryptocurrencies. However, there is not yet a unifying underlying theory that guides the modelling of the cryptocurrencies. Moreover, the decentralized and volatile nature of the cryptocurrency, which is supported by Aharon and Qadan (2019), further aggravates the problem of model uncertainty that is already rife in the field of asset pricing.
To handle model uncertainty, model selection and model averaging have long been the competing approaches that originate from dichotomous modelling philosophies. Model selection searches for the most relevant variables, while model averaging aims to smooth over a set of candidate models rather than committing to a single model. For model selection, Hoerl and Kennard (1970) and Tibshirani (1996) introduced, respectively, the ridge estimator and the least absolute shrinkage and selection operator (LASSO) estimator for simultaneous selection and estimation. In contrast, the traditional best subsets approach often leads to local solutions.
However, the ridge and LASSO estimators’ asymptotic bias prompted the development of a class of penalized least squares estimators that also possess the oracle properties. In particular, Fan and Li (2001) developed the smoothly clipped absolute deviation (SCAD) estimator, Zou (2006) introduced the adaptive LASSO (AdaLASSO) estimator, and Zhang (2010) invented the minimax concave penalty (MCP) estimator.
The penalized estimators’ performance also relies on the tuning parameter choice for the penalty function to achieve the optimal estimation. The cross-validation (CV) and the information criterion (IC) are the two conventional approaches for selecting the tuning parameter. Shi and Tsai (2002) have shown that the Bayesian information criterion (BIC) consistently identifies the true model in finite samples. For the algorithm generating the candidate tuning parameters, Tibshirani et al. (2010) and Breheny and Huang (2011) developed the cyclical coordinate descent (CCD) algorithm to compute the solution path for the penalized least squares estimators. They provided evidence that the tuning parameter choice affects the model selection outcomes.
In the spirit of the OLS post-LASSO estimator introduced by Belloni and Chernozhukov (2013), Xiao and Sun (2019) further studied the finite sample performances of the class of OLS post-selection estimators with the tuning parameters determined by different tuning parameter selection approaches. The OLS post-selection estimators avoid the complicated penalty functions in building inference, and the results from Xiao and Sun (2019) further validate the importance of the tuning parameter selection.
On the other hand, for model averaging, the Shrinkage Mallows Model Averaging (SMMA) estimator introduced by Xiao and Sun (2019) complements the penalized estimators by allowing for more than one model selection outcome and hedges against the penalized estimators’ sensitivity towards the tuning parameter choice. The SMMA estimator extends the current Mallows Model Averaging (MMA) framework from Hansen (2007) by introducing a reasonable way for selecting candidate models. The SMMA estimator also builds on the shrinkage averaging estimator (SAE) by Schomaker (2012) by incorporating the tuning parameter optimization problem for each candidate model. The SMMA estimator essentially combines model selection and model averaging to address model uncertainty.
In this paper, we examine the determinants of the returns for major cryptocurrencies by applying the autoregressive distributed lag (ARDL) model with model selection via penalized least squares estimators. Then we average the candidate sub-models post model selection by the SMMA estimator for forecasting cryptocurrencies’ returns. In particular, the issue of ARDL(p,q) models’ lag selection, the discrete nature of traditional model selection algorithms, and the sensitivity of the tuning parameter choice affecting variable selection outcomes culminate in a call for a novel modelling approach. Therefore, we contribute to the existing literature on forecasting cryptocurrencies via a novel model averaging approach where both model selection and model averaging are combined by our SMMA estimator to address the cryptocurrencies’ model uncertainty.
We find that most cryptocurrencies’ returns are sensitive to fluctuations from major financial markets. They are also susceptible to the changes in gold prices and the Forex market’s current and lagged information. We then forecast the returns of the cryptocurrencies via the SMMA estimator. We find that an ARDL(p,q) model estimated by the SMMA estimator outperforms competing estimators and competing models considered in this paper for forecasting cryptocurrencies’ returns.
2. Methodology
To handle the model uncertainty that is common in asset pricing, especially for the cryptocurrencies, we consider both model selection and model averaging approaches. For model selection, we mainly consider the penalized least squares estimators studied in detail by Xiao and Sun (2019). According to their data examples, the six best-performing OLS post-selection estimators are listed in Table 1. They provide not only an estimation strategy, but more importantly, an appropriate data-driven approach for the ARDL(p,q) model’s lag selection, as there is not yet a unified approach for its lag selection.
In Table 1, the OLS post-SCAD(BIC) estimator is constructed with the tuning parameter in the SCAD penalty function selected by the BIC. Similarly, the OLS post-SCAD(GCV) estimator is constructed with the tuning parameter in the SCAD penalty function selected by the generalized cross validation (GCV).
For example, the OLS post-SCAD(BIC) estimator for the ARDL(p,q) model in Equation (23) can be constructed as follows.
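Let $\Lambda = \{\lambda_1, \ldots, \lambda_M\}$ be the set of candidate tuning parameters and $\lambda_m \in \Lambda$ with $m = 1, \ldots, M$. Given any $\lambda_m$, the SCAD estimator evaluated at $\lambda_m$ gives:
$$\hat{\beta}(\lambda_m) = \arg\min_{\beta} \left\{ \frac{1}{2n} \lVert y - Z\beta \rVert^2 + \sum_{v=1}^{K} F_{\lambda_m}(|\beta_v|) \right\}, \tag{1}$$
where $F$ is the SCAD penalty function and the sum runs over the $K$ columns of the predictor matrix $Z$.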
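For reference, the SCAD penalty has a standard piecewise closed form in Fan and Li (2001). The sketch below evaluates it elementwise; the function name `scad_penalty` is hypothetical, and the default `a = 3.7` is the value suggested by Fan and Li, assumed here rather than confirmed as the choice made in this paper.

```python
import numpy as np

def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty of Fan and Li (2001), evaluated elementwise at |beta|.

    a = 3.7 is Fan and Li's suggested default (an assumption for this sketch).
    """
    b = np.abs(np.asarray(beta, dtype=float))
    linear = lam * b                                          # region |b| <= lam
    quad = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))  # region lam < |b| <= a*lam
    flat = (a + 1) * lam**2 / 2                               # region |b| > a*lam
    return np.where(b <= lam, linear, np.where(b <= a * lam, quad, flat))

values = scad_penalty([0.0, 0.5, 2.0, 5.0], lam=1.0)
print(values)
```

The penalty grows linearly near zero, like the LASSO, and becomes constant for large coefficients, which is what removes the bias on large effects.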
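The BIC evaluated at this $\lambda_m$ is defined as $\mathrm{BIC}(\lambda_m)$, which is given by:
$$\mathrm{BIC}(\lambda_m) = \log \hat{\sigma}^2(\lambda_m) + \hat{k}(\lambda_m)\,\frac{\log n}{n},$$
where the values for $\lambda_m$ originate from an exponentially decaying grid as in Tibshirani et al. (2010).
Let $\hat{A}(\lambda_m)$ denote the set of nonzero parameters of the model when evaluated at $\lambda_m$ so that $\hat{A}(\lambda_m) = \{v : \hat{\beta}_v(\lambda_m) \neq 0\}$. Then, $\hat{k}(\lambda_m) = |\hat{A}(\lambda_m)|$ gives the number of nonzero parameters of the model when evaluated at $\lambda_m$ and $\hat{\sigma}^2(\lambda_m) = n^{-1} \lVert y - Z\hat{\beta}(\lambda_m) \rVert^2$ following Shi and Tsai (2002).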
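A minimal sketch of the two ingredients described here: an exponentially decaying grid of candidate tuning parameters and a BIC evaluated at one fitted coefficient vector. The function names, the grid length and the decay ratio are hypothetical, and the BIC is assumed to take the common form of log residual variance plus a log(n)/n penalty per nonzero coefficient.

```python
import numpy as np

def lambda_grid(lam_max, n_lambda=100, ratio=1e-3):
    """Exponentially decaying grid from lam_max down to ratio * lam_max.

    n_lambda and ratio are illustrative defaults, not values from the paper.
    """
    return np.geomspace(lam_max, ratio * lam_max, n_lambda)

def bic(y, Z, beta):
    """Assumed form: log(RSS / n) + (number of nonzero coefficients) * log(n) / n."""
    n = len(y)
    resid = y - Z @ beta
    sigma2 = resid @ resid / n
    k = np.count_nonzero(beta)
    return np.log(sigma2) + k * np.log(n) / n

grid = lambda_grid(2.0, n_lambda=5)
y = np.array([1.0, 2.0, 3.0, 4.0])
Z = np.eye(4)
value = bic(y, Z, np.array([1.0, 2.0, 0.0, 0.0]))
print(grid, value)
```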
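The estimate of the optimal tuning parameter is denoted by $\hat{\lambda}$ and solves the following problem:
$$\hat{\lambda} = \arg\min_{\lambda_m \in \Lambda} \mathrm{BIC}(\lambda_m).$$
Consequently, $\hat{\beta}(\hat{\lambda})$ minimizes the SCAD penalized objective function given by Equation (1). Denoting $\hat{A} = \hat{A}(\hat{\lambda})$, we define the OLS post-SCAD(BIC) estimator as:
$$\tilde{\beta} = \arg\min_{\beta} \Bigl\lVert y - \sum_{v \in \hat{A}} z_v \beta_v \Bigr\rVert^2,$$
where $z_v$ is an $n \times 1$ vector, which is the $v$th column of the predictor matrix $Z$, and $\beta_v$ is the $v$th parameter in $\beta$.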
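The post-selection step can be sketched as follows: score every point on a solution path by the BIC, keep the support of the best point, and refit by OLS on that support only. This is an illustration under stated assumptions, not the authors' implementation. The solution path below is a stand-in built by soft-thresholding the OLS coefficients, not a SCAD path, and all names (`post_selection_ols`, `bic`, the simulated data) are hypothetical.

```python
import numpy as np

def bic(y, Z, beta):
    # Assumed BIC form: log(RSS / n) + k * log(n) / n
    n = len(y)
    resid = y - Z @ beta
    return np.log(resid @ resid / n) + np.count_nonzero(beta) * np.log(n) / n

def post_selection_ols(y, Z, beta_path, criterion=bic):
    """Pick the path point minimizing the criterion, then refit OLS on its support."""
    scores = [criterion(y, Z, b) for b in beta_path]
    best = int(np.argmin(scores))
    support = np.flatnonzero(beta_path[best])
    beta_tilde = np.zeros(Z.shape[1])
    if support.size:
        beta_tilde[support] = np.linalg.lstsq(Z[:, support], y, rcond=None)[0]
    return beta_tilde, support

# Simulated data with two relevant regressors (illustrative only)
rng = np.random.default_rng(0)
n, K = 200, 6
Z = rng.standard_normal((n, K))
y = Z @ np.array([1.5, 0.0, -2.0, 0.0, 0.0, 0.0]) + 0.5 * rng.standard_normal(n)

# Stand-in solution path: soft-thresholded OLS coefficients on a decaying grid
beta_ols = np.linalg.lstsq(Z, y, rcond=None)[0]
grid = np.geomspace(np.abs(beta_ols).max(), 1e-3, 30)
path = [np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - lam, 0.0) for lam in grid]

beta_tilde, support = post_selection_ols(y, Z, path)
print(support, np.round(beta_tilde, 3))
```

Because the refit is unpenalized, the retained coefficients are not shrunk towards zero, which is the motivation for post-selection estimators.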
Similarly, we construct other OLS post-selection estimators shown in Table 1 as the OLS post-selection(BIC or GCV) estimator. The OLS post-selection(BIC or GCV) estimator minimizes the BIC or the GCV criterion, respectively, to estimate the optimal tuning parameter.
For model averaging, we adopt a novel model averaging approach, the SMMA estimator introduced by Xiao and Sun (2019), as it combines model selection and model averaging to handle model uncertainty. The SMMA estimator performs significantly better when averaging high dimensional sparse models against model uncertainty.
The SMMA estimator starts with a large general model and applies different penalty methods to select the candidate models for averaging. It is a two-stage estimator. For this paper, in the first stage, we apply the different penalty estimators introduced in Table 1 with optimal tuning parameters selected via the GCV or BIC, thereby obtaining a sequence of candidate models. Then we apply the MMA criterion in the second stage to estimate the model parameters.
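Let $\hat{\Lambda}$ represent the set of optimal tuning parameters selected either by the BIC or GCV for the model selection procedures introduced in Table 1 such that:
$$\hat{\Lambda} = \{\hat{\lambda}_1, \ldots, \hat{\lambda}_S\},$$
where $\hat{\lambda}_s$ represents the optimal tuning parameter for the $s$th candidate model given by the $s$th penalized least squares estimator, and $s = 1, \ldots, S$.
The model averaging estimator for the conditional mean of $y$ is $\hat{\mu}(w)$, which is defined as:
$$\hat{\mu}(w) = \sum_{s=1}^{S} w_s P_s y,$$
where the projection matrix for the $s$th candidate model is defined as:
$$P_s = Z \Pi_s^{\top} \left( \Pi_s Z^{\top} Z \Pi_s^{\top} \right)^{-1} \Pi_s Z^{\top},$$
and the estimator for the parameters in the $s$th candidate model is given by:
$$\hat{\beta}_s = \left( \Pi_s Z^{\top} Z \Pi_s^{\top} \right)^{-1} \Pi_s Z^{\top} y.$$
Therefore, the SMMA estimator for the ARDL(p,q) model is solved as:
$$\hat{\beta}(w) = \sum_{s=1}^{S} w_s \Pi_s^{\top} \hat{\beta}_s,$$
where $k_s$ is the number of nonzero values in $\hat{\beta}(\hat{\lambda}_s)$ and $\Pi_s$ is a $k_s \times K$ matrix specified as:
$$\Pi_s = \left( e_{j_1}, \ldots, e_{j_{k_s}} \right)^{\top}, \qquad j_1 < \cdots < j_{k_s},$$
whose rank is $k_s$, and each element in $\Pi_s$ is either one or zero to map $\beta$ to a $k_s \times 1$ vector, while $Z$ is the $n \times K$ regressor matrix. Here $e_j$ denotes the $j$th standard basis vector of $\mathbb{R}^K$ and $j_1, \ldots, j_{k_s}$ index the nonzero entries of $\hat{\beta}(\hat{\lambda}_s)$.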
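The zero-one selection matrix described here can be illustrated with a short sketch. The helper name `selection_matrix` and the toy support are hypothetical; the point is only that multiplying by the matrix extracts the retained coordinates and multiplying by its transpose embeds them back with zeros elsewhere.

```python
import numpy as np

def selection_matrix(support, K):
    """k_s x K zero-one matrix that picks the coordinates listed in `support`."""
    support = np.sort(np.asarray(support))
    Pi = np.zeros((support.size, K))
    Pi[np.arange(support.size), support] = 1.0
    return Pi

K = 5
Pi = selection_matrix([3, 0], K)      # hypothetical candidate model keeping columns 0 and 3
beta = np.arange(1.0, K + 1)          # the vector (1, 2, 3, 4, 5)
sub = Pi @ beta                       # coordinates 0 and 3 of beta
embedded = Pi.T @ sub                 # back in R^K, zeros in the dropped positions
print(sub, embedded)
```

For a regressor matrix `Z`, the product `Z @ Pi.T` is the same as `Z[:, support]`, so in practice the matrix rarely needs to be formed explicitly.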
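The weight vector $w$ is estimated by the MMA criterion,
$$\hat{w} = \arg\min_{w \in \mathcal{H}} C(w), \qquad C(w) = \lVert y - \hat{\mu}(w) \rVert^2 + 2 \sigma^2 k(w),$$
where $w = (w_1, \ldots, w_S)^{\top}$ is a weight vector in the unit simplex in $\mathbb{R}^S$ with $w_s \in [0, 1]$ such that:
$$\mathcal{H} = \Bigl\{ w \in [0, 1]^S : \sum_{s=1}^{S} w_s = 1 \Bigr\},$$
and the effective number of parameters $k(w)$ is defined as:
$$k(w) = \sum_{s=1}^{S} w_s k_s.$$
Let the $L$ index be the largest model in dimension from the set of the candidate models, i.e., $L = \arg\max_{s} k_s$, and $k_L$ be the number of nonzero parameters in the largest candidate model.
According to Hansen (2007), the $\sigma^2$ term will be estimated by $\hat{\sigma}_L^2$, which is given below:
$$\hat{\sigma}_L^2 = \frac{(y - P_L y)^{\top} (y - P_L y)}{n - k_L},$$
with $P_L$ the projection matrix of the largest candidate model.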
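The second stage can be sketched as a small constrained optimization. The code below minimizes a Mallows-type criterion in the spirit of Hansen (2007), that is, the squared residual norm of the averaged fit plus twice the error variance times the weighted model size, over the unit simplex. It is a sketch under stated assumptions: the candidate supports are hypothetical, the function name `mma_weights` is invented for illustration, and SciPy's SLSQP solver is a convenience choice, not necessarily the solver used by the authors.

```python
import numpy as np
from scipy.optimize import minimize

def mma_weights(y, Z, supports):
    """Mallows-type weights over candidate models given as lists of column indices."""
    n = len(y)
    fits = np.column_stack(
        [Z[:, s] @ np.linalg.lstsq(Z[:, s], y, rcond=None)[0] for s in supports]
    )                                                  # column s holds the OLS fit of model s
    sizes = np.array([len(s) for s in supports], dtype=float)
    largest = int(np.argmax(sizes))
    resid = y - fits[:, largest]
    sigma2 = resid @ resid / (n - sizes[largest])      # variance from the largest model

    def criterion(w):
        e = y - fits @ w
        return e @ e + 2.0 * sigma2 * (sizes @ w)

    S = len(supports)
    res = minimize(criterion, np.full(S, 1.0 / S), method="SLSQP",
                   bounds=[(0.0, 1.0)] * S,
                   constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
    return res.x, sigma2

# Simulated data and hypothetical candidate models (illustrative only)
rng = np.random.default_rng(1)
n, K = 300, 6
Z = rng.standard_normal((n, K))
y = Z @ np.array([1.0, 0.0, -1.0, 0.3, 0.0, 0.0]) + rng.standard_normal(n)
supports = [[0, 2], [0, 2, 3], [0, 1, 2, 3, 4, 5]]
w_hat, sigma2_hat = mma_weights(y, Z, supports)
print(np.round(w_hat, 3), round(float(sigma2_hat), 3))
```

The criterion is quadratic in the weights, so a dedicated quadratic programming routine would also work; a general-purpose solver keeps the sketch short.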
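Following Hansen (2008), an out-of-sample forecast combination by the SMMA estimator is generated as:
$$\hat{y}(\hat{w}) = \sum_{s=1}^{S} \hat{w}_s \hat{y}_s,$$
where $\hat{y}_s$ is the forecast from the $s$th candidate model given $\hat{\beta}_s$, and it is defined as:
$$\hat{y}_s = Z^{f} \Pi_s^{\top} \hat{\beta}_s,$$
where $Z^{f}$ is the out-of-sample regressor matrix that is $f \times K$ in dimension, with $f$ the number of out-of-sample observations.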
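As a small worked illustration of the forecast combination, the sketch below forms each candidate model's forecast from its own columns and coefficients and then takes the weighted sum. The numbers, supports and weights are made up for the example, and `combine_forecasts` is a hypothetical helper name.

```python
import numpy as np

def combine_forecasts(Z_f, supports, betas, weights):
    """Weighted sum of candidate-model forecasts for out-of-sample regressors Z_f."""
    forecasts = np.column_stack(
        [Z_f[:, s] @ b for s, b in zip(supports, betas)]   # one column per candidate model
    )
    return forecasts @ np.asarray(weights, dtype=float)

Z_f = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])                 # two out-of-sample periods, three regressors
supports = [[0], [0, 2]]                          # hypothetical candidate models
betas = [np.array([2.0]), np.array([1.0, 1.0])]   # their estimated coefficients
weights = [0.25, 0.75]                            # averaging weights summing to one
combined = combine_forecasts(Z_f, supports, betas, weights)
print(combined)
```

Here the first model forecasts (2, 8), the second forecasts (4, 10), and the combination is 0.25 times the first plus 0.75 times the second.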
4. Data
First, the dataset contains the daily close prices for the top seven cryptocurrencies by market capitalization in USD as of 8 August 2020 according to coinmarketcap.com, which is a website that specializes in cryptocurrencies. The top seven cryptocurrencies are Bitcoin (BTC), Ethereum (ETH), Ripple (XRP), Tether (USDT), Bitcoin Cash (BCH), Bitcoin SV (BSV) and Litecoin (LTC). The cryptocurrencies studied in the paper are summarized in Table 2.
Table 2 presents the codes, release dates, market capitalizations (USD), rankings and sample sizes for the cryptocurrencies considered in this paper. It is worth noting that Ripple commands a relatively higher market capitalization than USDT and BCH despite a later release date. The USDT and BCH were released earlier in 2015 and 2017, respectively.
Following the literature, the returns for the cryptocurrency are defined as,
$$r_t = \ln P_t - \ln P_{t-1}, \tag{24}$$
where $P_t$ stands for the daily close prices for respective cryptocurrencies at period $t$. To explain the cryptocurrency returns, we also incorporate several other independent variables into the dataset that comprise macroeconomic and financial market conditions.
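A minimal sketch of the return calculation, assuming plain log returns (the difference of log close prices) with no further scaling. The function name and the toy prices are hypothetical.

```python
import numpy as np

def log_returns(prices):
    """Log returns ln(P_t) - ln(P_{t-1}) from a sequence of close prices."""
    p = np.asarray(prices, dtype=float)
    return np.diff(np.log(p))

prices = [100.0, 110.0, 99.0]   # toy close prices
returns = log_returns(prices)
print(returns)
```

One convenient property of log returns is that they add up over time: the sum of the daily returns equals the log of the ratio of the last price to the first.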
To account for the macroeconomic conditions, we include the following independent variables: the daily Brent crude oil price following Aharon and Qadan (2019), and the daily London Bullion Market Association (LBMA) gold price following Dyhrberg (2016a), Dyhrberg (2016b) and Wu et al. (2019). We also include the U.S./Euro foreign exchange rate, the daily US federal funds rate, the daily economic policy uncertainty index for the US, and the monthly global economic policy uncertainty index following Bouri et al. (2017), Demir et al. (2018) and Panagiotidis et al. (2019). The daily time series data for the oil price, gold price, U.S./Euro foreign exchange rate and the US federal funds rate (FFR) are collected from quandl.com. The daily time series data for the US economic policy uncertainty index and the monthly time series data for the global economic policy uncertainty index are collected from policyuncertainty.com.
To control for the financial market conditions, we also introduce the Dow Jones NYSE index, the S&P 500 index, the Volatility Index (VIX), which is the daily Chicago Board Options Exchange (CBOE) volatility index in U.S. dollars, and Fama and French’s Momentum Factor (Mom) following Aharon and Qadan (2019). The daily time series data for the Dow Jones NYSE index and S&P 500 index are from Yahoo Finance. The daily time series data for VIX are collected from the CBOE’s official website at cboe.com, and the daily time series data for Mom are collected from the Kenneth R. French Data Library.
Similarly, the daily returns for the independent variables in this dataset are defined following Equation (24). The descriptive statistics for the dataset are shown in Table 3.
We further conduct the normality test and the unit root test for the daily return series in the dataset, and the results are given in Table 4.
For the normality test, we present p-values for the dataset variables in Table 4. We conduct the normality test because non-normality is common in financial datasets. To handle such non-normality in the data, we have made the sample as large as possible to take advantage of the central limit theorem. As shown above, our cryptocurrencies sample starts from the day the said cryptocurrencies are openly traded and runs until 8 August 2020, when this paper was written.
For the unit root test, we present p-values for the variables in the dataset in Table 4. The unit root test is important as it tests whether a time series is stationary. We conduct unit root tests for the time series data first to avoid spurious regressions. The unit root test for FFR is omitted since the variable of interest is the FFR in levels, rather than the first-order differenced FFR. We use the FFR in levels in our subsequent regression models.
5. Empirical Results
To determine the optimal order of lags for the ARDL(p,q) model for each cryptocurrency listed in Table 2, we shall apply the penalized least squares (PLS) estimators due to the lack of a unifying approach for the ARDL(p,q) model’s lag selection and the discrete nature of the traditional model selection algorithms. In contrast, the PLS estimators provide a more holistic approach for simultaneous lag selection and parameter estimation.
Since the true data generating process (DGP) is not observed, we will also consider competing estimators and competing models. The competing estimators for estimating the ARDL(p,q) model are the Naive OLS, all of the penalized least squares estimators in Table 1, and the SMMA estimator.
To determine which model better fits the data, the competing models are the traditional asset pricing model (APM), the constant model, the artificial neural network (ANN) model with activation function and the back-propagation neural network (BPNN) model.
To evaluate the performances, we compare the respective mean squared prediction error (MSPE), adjusted $R^2$, and model size (MS), which gives the number of non-zero estimated parameters. MSPE measures the fit, adjusted $R^2$ gauges the performance in identifying the most relevant regressors, and MS measures parsimony and efficiency. The evaluation is presented in Table 5.
As shown in Table 5, the best performing penalized least squares estimators for each cryptocurrency are selected based on MSPE and adjusted $R^2$. The respective penalized least squares estimators and the SMMA estimator outperform the naive OLS in terms of parsimony while yielding similar MSPE and adjusted $R^2$.
For the residual diagnostics shown in Table 6, we conducted the Breusch–Godfrey test up to the order of three for detecting any remaining serial autocorrelation in the model residuals. We found no serial autocorrelation in the model residuals for all the models considered in Table 6. This further supports the application of the ARDL(p,q) model, and we conclude that based on the empirical evidence, there is no need for us to further consider the class of ARMA models.
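The residual check can be sketched with the textbook LM form of the Breusch–Godfrey test: regress the residuals on the original regressors plus their own lags, and compare the sample size times the R-squared of that auxiliary regression with a chi-squared distribution. The code below is an illustration under stated assumptions, not the authors' implementation; the function name and the simulated data are hypothetical, and pre-sample lagged residuals are set to zero.

```python
import numpy as np
from scipy.stats import chi2

def breusch_godfrey(resid, X, order=3):
    """LM statistic n * R^2 from regressing residuals on X and `order` residual lags.

    X is assumed to contain a constant column; pre-sample lags are filled with zeros.
    """
    u = np.asarray(resid, dtype=float)
    n = u.size
    lags = np.column_stack([np.r_[np.zeros(j), u[:-j]] for j in range(1, order + 1)])
    W = np.column_stack([X, lags])
    coef = np.linalg.lstsq(W, u, rcond=None)[0]
    e = u - W @ coef
    centered = u - u.mean()
    r2 = 1.0 - (e @ e) / (centered @ centered)
    lm = n * r2
    return lm, chi2.sf(lm, df=order)

# Simulated regression with strongly autocorrelated errors (illustrative only)
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
shocks = rng.standard_normal(n)
err = np.zeros(n)
for t in range(1, n):
    err[t] = 0.8 * err[t - 1] + shocks[t]
y = X @ np.array([1.0, 2.0]) + err
coef = np.linalg.lstsq(X, y, rcond=None)[0]
lm_stat, p_value = breusch_godfrey(y - X @ coef, X, order=3)
print(lm_stat, p_value)
```

A small p-value, as in this simulated case, signals remaining serial correlation; the finding reported in the text corresponds to large p-values for every model.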
The ARDL(p,q) model estimated by penalized least squares estimators and the SMMA estimator outperforms the competing models in MSPE. This supports applying the ARDL(p,q) model and our estimation strategies to better explain the cryptocurrencies’ returns. However, the ARDL(p,q) model estimated by penalized least squares estimators outperforms the SMMA estimator in MS. Therefore, to explain the returns, we shall focus on the estimation results of the ARDL(p,q) model by the corresponding penalized least squares estimators for parsimony.
Table 7 presents the determinants of the returns for each cryptocurrency.
In summary, given the sparsity of the parsimonious ARDL(p,q) model after model selection, we do not consider including more lags in the ARDL(p,q) model, and the true ARDL(p,q) model is likely sparse as well. We then examine the returns determinants for each cryptocurrency, respectively.
For the BTC’s daily returns, daily returns from the S&P 500 and the gold market have positive effects. This finding is consistent with Aharon and Qadan (2019), who suggested that the BTC’s daily returns are positively correlated with the S&P 500 and Gold. This indicates that the BTC market responds promptly to the performances of the S&P 500 index and the gold market.
In comparison, daily returns from the NYSE and the gold market have positive effects on the ETH’s daily returns. This finding is supported by Deniz and Stengos (2020), who also discovered a positive correlation between the ETH market and the NYSE and the gold market.
Lagged XRP returns, the NYSE and the Mom introduced by Fama and French (1993) have positive effects on the current daily returns of the XRP. This finding coincides with Deniz and Stengos (2020), where a positive correlation between the XRP market and the NYSE index is also discovered.
However, the FFR has a negative effect on the XRP’s daily returns, which suggests that the XRP is sensitive to the price of liquidity. Therefore, as the liquidity becomes cheaper, investors divest from the XRP for other assets. Deniz and Stengos (2020) also uncovered the negative relationship between the U.S. federal funds market and the XRP market.
For the USDT, its lagged returns, NYSE and the Mom negatively affect the current daily returns. The Euro, which represents the daily percentage change from the foreign exchange rate, and the lagged Euro negatively affect the USDT daily returns. It suggests that the USDT is more exposed to the volatility in the Forex market. Based on the estimation results, the USDT could provide an investment strategy that hedges the risk exposures from the XRP market as they react contrarily to the fluctuations in financial markets.
The BCH’s daily returns rely heavily on the Mom factor, and the Mom has a positive effect on the current daily returns. As a derivative cryptocurrency from the original cryptocurrency Bitcoin, the BCH market is much more speculative based on this evidence. BCH investors tend to focus on the financial markets’ momentum as the central reference point for developing investment strategies.
However, for the BSV, which is also a derivative cryptocurrency from the original cryptocurrency Bitcoin, its current daily returns are strongly correlated with its lagged returns. Besides, BSV heavily relies on the daily returns from the gold as the primary returns determinant. It suggests that the BSV market is more closely connected to the gold market, and it provides similar risk-hedging investment strategies to the gold.
For the LTC market, contemporaneous returns from Gold, Mom and Euro have positive effects on the LTC’s contemporaneous daily returns. In contrast, the lagged returns from S&P 500, Mom and Euro have negative effects. This finding is also consistent with Deniz and Stengos (2020). It suggests that the LTC market is contemporaneously positively correlated with the gold market, the Forex market, and the financial market’s momentum while being negatively affected by the lagged information from the markets above. The LTC market closely reacts to the fluctuations in the gold market and the Forex market, and to the momentum of the financial market.
Overall, most of the cryptocurrencies (BTC, ETH, BSV and LTC) in Table 7 are sensitive to changes in the gold market as they provide a similar investment strategy to hedge against the risks from the real economy. In this aspect, the cryptocurrencies above provide similar functionalities to the precious metals market.
Besides Gold, NYSE, Mom, and Euro are also major deciding factors for the cryptocurrencies’ daily returns. In particular, ETH, XRP, USDT, BCH and LTC are sensitive to the changes and the momentum in financial markets. It suggests these cryptocurrencies show more extensive exposures to the fluctuations in conventional financial assets. Furthermore, the dependence on the lagged returns for cryptocurrencies such as the XRP, USDT and BSV suggests that their returns follow an autoregressive process so that market memories from the lagged market information are a crucial factor to consider when modelling their daily returns.
6. Forecast Evaluation
We generate one-step-ahead forecasts for the daily returns of each cryptocurrency by the recursive forecasting approach. Following Hansen (2008) and Xiao and Sun (2019), forecasts for the cryptocurrencies’ daily returns from the ARDL(p,q) model are averaged by the SMMA estimator. Specifically, we forecast the next day’s daily returns based on the market information accumulated up to the previous day.
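The recursive scheme can be sketched as an expanding-window loop: at each step the model is re-estimated on all observations before the target day, and a single forecast is produced for that day. In the sketch below plain OLS stands in for the SMMA estimator, the function names are hypothetical, and each row of `Z` is assumed to hold only regressors that are available when the forecast for that row is made.

```python
import numpy as np

def ols_fit(y, Z):
    """Stand-in estimator (plain OLS); any fit routine with this signature could be used."""
    return np.linalg.lstsq(Z, y, rcond=None)[0]

def recursive_one_step(y, Z, n_init, fit=ols_fit):
    """Expanding-window one-step-ahead forecasts for periods n_init, ..., len(y) - 1."""
    forecasts = []
    for t in range(n_init, len(y)):
        coef = fit(y[:t], Z[:t])         # estimate on observations 0..t-1 only
        forecasts.append(Z[t] @ coef)    # forecast for period t
    return np.array(forecasts)

# Toy check on an exactly linear series, where every forecast should be exact
x = np.arange(20.0)
Z = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
y_hat = recursive_one_step(y, Z, n_init=10)
print(y_hat[:3])
```

Re-estimating at every step is what makes the exercise computationally heavy when the fit routine involves a solution path and a weight optimization for each window.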
To facilitate the forecast evaluation, we compare the ARDL model’s forecasting performance when estimated by the SMMA estimator against the aforementioned competing estimators and competing models. The competing models are the traditional asset pricing model (APM), the constant model, the artificial neural network (ANN) model with activation function and the back-propagation neural network (BPNN) model.
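For performance evaluation, we compare the forecast accuracy of respective forecasting approaches by the root mean squared forecast error (RMSFE), defined as:
$$\mathrm{RMSFE} = \sqrt{\frac{1}{f} \sum_{i=1}^{f} \left( y_i - \hat{y}_i \right)^2},$$
where $\hat{y}_i$ is the out-of-sample forecast for $y_i$ generated by respective approaches and $f$ represents the forecast sample size.
Similarly, the root mean absolute forecast error (RMAFE) is defined as:
$$\mathrm{RMAFE} = \sqrt{\frac{1}{f} \sum_{i=1}^{f} \left| y_i - \hat{y}_i \right|}.$$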
We used the “Rstudio” software to conduct the forecast evaluation, and the computation took 168 h on a computer with an Intel(R) Core(TM) i7-7700HQ CPU. The forecasting performance evaluation of the competing estimators and competing models is presented in Table 8 below.
Since the actual returns for the cryptocurrencies $y$ and the forecasts for the cryptocurrency returns $\hat{y}$ are already in log terms, the RMSFE and the RMAFE give the approximate average squared percentage deviations and average absolute percentage deviations from the actual returns by the respective forecasting method.
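The two accuracy measures can be computed as in the sketch below. It assumes, as the names suggest, that both are square roots of an average over the forecast sample: of squared errors for the RMSFE and of absolute errors for the RMAFE. The function names and the toy numbers are hypothetical.

```python
import numpy as np

def rmsfe(y, y_hat):
    """Root mean squared forecast error."""
    e = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return np.sqrt(np.mean(e**2))

def rmafe(y, y_hat):
    """Root mean absolute forecast error (square root of the mean absolute error)."""
    e = np.asarray(y, dtype=float) - np.asarray(y_hat, dtype=float)
    return np.sqrt(np.mean(np.abs(e)))

actual = [1.0, 2.0, 3.0]
forecast = [1.0, 2.0, 5.0]       # a single miss of size 2 in the last period
print(rmsfe(actual, forecast), rmafe(actual, forecast))
```

In this toy case the mean squared error is 4/3 and the mean absolute error is 2/3, so the squared-error measure weighs the single large miss more heavily.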
It is clear from the forecasting performances that the ARDL(p,q) model estimated by the SMMA estimator consistently outperforms the competing estimators and competing models in both RMSFE and RMAFE. This result supports the application of the ARDL(p,q) model and the SMMA estimator in forecasting the cryptocurrency markets. It further highlights the importance of attending to model uncertainty in forecasting cryptocurrencies.
7. Conclusions
This paper has investigated the major cryptocurrencies’ returns determinants and introduced a novel model averaging approach, namely the SMMA estimator, for forecasting cryptocurrencies’ returns.
Model uncertainty features most asset pricing studies, and such model uncertainty proves especially challenging for cryptocurrencies in the absence of a unifying theory that should guide modelling. In particular, the issue of model’s lag selection, the discrete nature of the traditional model selection algorithms, and the sensitivity of the tuning parameter choice affecting variable selection outcomes for penalized estimators all call for a novel modelling approach.
Therefore, to handle model uncertainty, we introduced the SMMA estimator to forecast the major cryptocurrencies’ returns. The SMMA estimator combines model selection and model averaging to handle model uncertainty and performs well when averaging high dimensional sparse models. Our model averaging approach outperformed the conventional benchmark forecasting models for the cryptocurrencies in the literature.
We first investigated the cryptocurrencies’ returns determinants and found that most cryptocurrencies are sensitive to changes in the financial markets such as the NYSE and the S&P 500, and to dynamics in the financial market momentum. They are also sensitive to the changes in gold prices and to current and lagged information from the Forex market.
From the forecast evaluation, it is clear that the ARDL(p,q) model estimated by the SMMA estimator consistently outperforms the competing estimators and models in RMSFE and RMAFE. The forecast evaluation result supports the application of the ARDL(p,q) model and the SMMA estimator in forecasting the cryptocurrency markets. It further highlights the importance of attending to model uncertainty in forecasting cryptocurrencies.
This paper remains limited in the following aspects. Due to the model uncertainty in cryptocurrencies, this paper could be further improved by considering more explanatory variables, especially the text-based variables that could help gauge the market sentiment for cryptocurrencies. Furthermore, more variants of machine learning approaches could be further introduced to compete against the SMMA estimator in forecasting the cryptocurrencies’ returns.