1. Introduction
The coronavirus disease 2019 (COVID-19) pandemic, which began in December 2019, has created not only a global health crisis but also significant turmoil in financial markets around the world. One noteworthy event was the crash of oil prices on 20 April 2020. For the first time in history, the front-month May 2020 West Texas Intermediate (WTI) crude oil contract settled at negative $37.63 a barrel on the New York Mercantile Exchange. The oil price has risen since the crash but has not yet recovered to its pre-pandemic level. As of 15 December 2021, the United States Oil Fund (USO), a representative indicator of the oil price, closed at $51.62. In comparison, USO ranged between $76 and $105 during 2019. Meanwhile, COVID-19 had killed 826,364 people in the US as of 15 December 2021, with 10,138,661 corresponding active cases (the COVID-19 data are collected from Our World in Data: https://ourworldindata.org/coronavirus-data (accessed on 30 December 2021)). While there is no clear evidence of when COVID-19 will end, it is crucial for financial market participants to reexamine their trading and valuation models and to assess the impacts of COVID-19 quantitatively.
This paper investigates the effects of COVID-19 on the energy market from one specific perspective: the performance of pairs trading portfolios. Pairs trading is a statistical arbitrage strategy based on assets with similar characteristics but dissimilar valuation dynamics. First developed at Morgan Stanley during the 1980s, pairs trading aims to find relative mispricing between securities and to profit from its convergence. The energy market provides an ideal group of candidates for pairs trading. Energy-related securities, such as energy stocks, futures, and Exchange-Traded Funds (ETFs), are ‘similar’ in the sense that their valuations are closely related to common fundamentals in the energy market. For example, ref. [1] reports that energy futures with different maturities are highly correlated. However, relative mispricings are possible because the securities differ in their features and market segments. Since COVID-19 has brought significant turbulence to the energy market, and energy-related securities are ideal for pairs trading, we study the impact of COVID-19 on the energy market through the performance of pairs trading portfolios.
Using daily data from 1 January 2015 to 5 December 2021, we construct arbitrage portfolios from five energy futures, the five largest and five smallest energy stocks, and five energy-related ETFs, for a total of 190 possible pairs. A parametric approach is adopted to form the pairs trading portfolios. It assumes that a linear combination of a pair of securities is stationary, and that any deviation from the long-run (theoretical) equilibrium is temporary. This deviation is called the spread, and it is the relative mispricing that the pairs trading strategy tries to capture. Profit can be generated if the spread converges to zero fast enough. To implement the model, we split the data into two sets: the in-sample set, on which we calibrate the model parameters, and the out-of-sample set, on which we test the model performance. For each pair, we compute the realized returns, their standard deviations, the Sharpe ratios and the standard deviations of the Sharpe ratios, for both the in-sample and out-of-sample sets. For comparison, we further divide the whole data sample into two sub-sample periods, using 23 January 2020 (the date on which the first case of COVID-19 in the US was confirmed) as the break point, and conduct the same analysis for each sub-sample. The details of the data split and sub-sampling are summarized in Table 1. By doing so, we can compare the performance before and after the pandemic began, and examine how the spread of COVID-19 shapes the landscape of arbitrage-based trading models.
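To make the calibrate-then-test idea concrete, the following is a minimal sketch rather than the implementation used in this paper. It assumes the linear combination is estimated by ordinary least squares on the in-sample prices, and it uses synthetic prices; the function names `fit_hedge_ratio` and `spread` are illustrative.

```python
import random


def fit_hedge_ratio(y, x):
    """Least-squares intercept and slope of y on x (assumed calibration step)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def spread(y, x, alpha, beta):
    """Deviation of y from the fitted linear relation with x."""
    return [b - alpha - beta * a for a, b in zip(x, y)]


# Synthetic pair: x is a random walk, y tracks 1 + 2x plus stationary noise.
rng = random.Random(0)
x = [50.0]
for _ in range(499):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [1.0 + 2.0 * a + rng.gauss(0.0, 0.5) for a in x]

# Calibrate on the first half (in-sample), then hold the parameters fixed
# and evaluate the spread on the second half (out-of-sample).
split = 250
alpha, beta = fit_hedge_ratio(y[:split], x[:split])
in_sample_spread = spread(y[:split], x[:split], alpha, beta)
out_of_sample_spread = spread(y[split:], x[split:], alpha, beta)
```

In this sketch the in-sample spread has zero mean by construction, whereas the out-of-sample spread stays near zero only if the relation between the two securities persists.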
A growing literature is devoted to understanding the impact of COVID-19 on the global economy and financial markets. For example, refs. [2,3,4,5,6,7,8] study the adverse impacts of COVID-19 on stock markets from various perspectives. For the energy market, recent scholarly work has provided practical insights for interested market participants, and this paper contributes to that line of research. Among these studies, ref. [9] investigates the nexus between the pandemic, political risk, and the equity and energy markets; ref. [10] analyses the natural gas and electricity markets in Spain, while ref. [11] focuses on the energy market in Italy. Ref. [12] discusses the challenges posed by COVID-19 to the energy sector. Ref. [13] studies the impact of COVID-19 on energy market volatility. Refs. [3,4,14] examine the effects of the pandemic in an equilibrium framework from which asset pricing implications are drawn. Ref. [15] derives the optimal investment decision for electricity producers during a pandemic. The trading model in this paper follows refs. [16,17,18,19].
This paper fills a gap in the literature as the first attempt to study the impact of COVID-19 on the energy market from the perspective of arbitrage trading. Moreover, our analysis spans a wide range of asset classes that are relevant for energy market participants. Previous literature on pairs trading in the energy market focuses on a single asset class. For example, ref. [20] studies the energy futures market and ref. [21] surveys energy stocks. To the best of our knowledge, we are the first to combine energy futures, selected stocks and ETFs in a pairs trading setting. The results in this paper are of immediate use to practitioners. On the one hand, our results could serve as a benchmark for other trading strategies based on statistical arbitrage theory. Indeed, pairs trading is the cornerstone of various modern statistical arbitrage approaches. It is based on the mean-reverting property of the spread, or relative price difference, between two securities. The idea can easily be extended to cases with multiple securities. Such an approach is usually called basket trading in the financial industry, but it builds on the basic case in which a pair of securities is formed and studied, which is exactly the task of the present paper. On the other hand, the main conclusion of this paper is preliminary: it implies that the conventional strategy, which once worked, breaks down during the COVID-19 pandemic and that more advanced trading models are called for.
The rest of the paper is organized as follows. Section 2 introduces the portfolio construction process and assumptions. Section 3 gives an overview of the data and descriptive statistics. Section 4 reports the in-sample and out-of-sample results for the whole sample and each of the sub-samples. Section 5 concludes.
3. Data and Descriptive Statistics
We choose five energy futures: BZ, CL, HO, NG and RB; the five largest energy equities: XOM, CVX, RDS-B, RDS-A, and PTR; the five smallest energy equities: MTR, CRT, NRT, MVO, and FET; and five energy-related ETFs: XLE, VDE, XOP, IYE, and OIH. Since there are 20 energy-related securities under study, and two securities are needed to form a trading pair, the total number of possible pairs is \(\binom{20}{2} = 190\). We compute the returns of all 190 possible pairs under the trading strategy and the entry-exit rule described in Section 2, given the actual daily prices between 1 January 2015 and 5 December 2021. To compare the performances and to determine the overall effects of COVID-19 on pairs trading, we split the whole data period into two sub-samples, using 23 January 2020 as the dividing point. For each sub-sample, we further separate the data into the in-sample (training) set and the out-of-sample (test) set. Specifically, for the full period, we treat the data between 1 January 2015 and 22 January 2020 as the in-sample set, and the data between 23 January 2020 and 5 December 2021 as the out-of-sample set. The rationale behind this separation is that we would like to see whether a constant set of model parameters calibrated before the pandemic performs similarly during the pandemic. For sub-sample 1, we treat the data between 1 January 2018 and 1 January 2019 as the in-sample data, and the data between 1 January 2019 and 22 January 2020 as the out-of-sample data; for sub-sample 2, we treat the data between 23 January 2020 and 22 January 2021 as the in-sample data, and the data between 23 January 2021 and 5 December 2021 as the out-of-sample data. Time series of selected futures, equities and ETFs are plotted in Figure 2.
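The pair count and the sample windows described above can be written down compactly. The sketch below is illustrative only: the dictionary layout and the helper `label` are assumptions made for this example, while the tickers and dates are those stated in the text.

```python
from datetime import date
from itertools import combinations

FUTURES = ["BZ", "CL", "HO", "NG", "RB"]
LARGEST = ["XOM", "CVX", "RDS-B", "RDS-A", "PTR"]
SMALLEST = ["MTR", "CRT", "NRT", "MVO", "FET"]
ETFS = ["XLE", "VDE", "XOP", "IYE", "OIH"]

universe = FUTURES + LARGEST + SMALLEST + ETFS
pairs = list(combinations(universe, 2))  # every unordered pair of securities

# (start, end) dates, inclusive, of the in-sample and out-of-sample windows.
SPLITS = {
    "full": {"in": (date(2015, 1, 1), date(2020, 1, 22)),
             "out": (date(2020, 1, 23), date(2021, 12, 5))},
    "sub1": {"in": (date(2018, 1, 1), date(2019, 1, 1)),
             "out": (date(2019, 1, 1), date(2020, 1, 22))},
    "sub2": {"in": (date(2020, 1, 23), date(2021, 1, 22)),
             "out": (date(2021, 1, 23), date(2021, 12, 5))},
}


def label(period, day):
    """Return 'in', 'out', or None for a date under the given sample period."""
    for name in ("in", "out"):
        start, end = SPLITS[period][name]
        if start <= day <= end:
            return name
    return None
```

For example, a trading day in March 2020 falls in the out-of-sample window of the full period but in the in-sample window of sub-sample 2, which is what allows the two sets of results to be contrasted.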
Pairs trading is implemented by finding two stocks whose prices have historically moved together, so that the resulting spread tends to revert to its mean. Two such stocks are considered ‘similar’ in the trading sense. In practice, the correlation coefficient is used to measure the ‘similarity’ between two stocks. Accordingly, we construct heat maps to visualize the general degree of ‘similarity’ between asset returns in an asset pool.
In Figure 3, Figure 4 and Figure 5, we use heat maps to plot the correlation matrix among the twenty energy-related securities for different sample periods. Figure 3 suggests that the correlation structure changed after COVID-19 began, as panel (b) is generally darker than panel (a), implying a higher level of co-movement between assets in the energy sector during the pandemic. In contrast, Figure 4a is similar to Figure 4b. This again motivates us to study the effect of COVID-19 on pairs trading performance, since a change in the correlation structure may break down previously established patterns in arbitrage-based trading.
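The correlation matrix behind such a heat map can be sketched as follows. This is a minimal illustration, assuming simple daily returns and the Pearson correlation coefficient; the price series and the labels A, B and C are made up and are not data from this study.

```python
import math


def simple_returns(prices):
    """Period-over-period simple returns of a price series."""
    return [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_matrix(price_series):
    """Map each (name_i, name_j) to the correlation of the two return series."""
    rets = {name: simple_returns(p) for name, p in price_series.items()}
    return {(i, j): pearson(rets[i], rets[j]) for i in rets for j in rets}


# Hypothetical prices: A and B move together, C moves against them.
toy_prices = {
    "A": [100.0, 102.0, 101.0, 105.0, 107.0],
    "B": [50.0, 51.0, 50.4, 52.6, 53.5],
    "C": [20.0, 19.5, 19.9, 19.0, 18.8],
}
corr = correlation_matrix(toy_prices)
```

Each entry of `corr` corresponds to one cell of a heat map; computing the matrix separately for the pre-pandemic and pandemic windows gives the two panels being compared.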
4. Results
The results of pairs trading on all 190 pairs are summarized in Figure 6, Figure 7 and Figure 8 for the different sample periods, and the main statistics are reported in Table 2.
To give readers an intuitive comparison of the out-of-sample and in-sample performances across periods, we plot the Sharpe ratios against the average returns of all 190 pairs. In Figure 6 and Figure 8, the out-of-sample performances are clearly worse than the in-sample performances. In Figure 7, however, the out-of-sample performances are consistent with those of the in-sample period.
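For readers who want to reproduce this kind of summary, the sketch below shows one way to compute a Sharpe ratio per pair and the cross-pair statistics. It is an illustration under stated assumptions: the Sharpe ratio is taken as the mean return divided by the sample standard deviation, with a zero risk-free rate and no annualization, which may differ from the convention used in Section 2. The return series are hypothetical.

```python
import statistics


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return over its sample standard deviation (assumed convention)."""
    excess = [r - risk_free for r in returns]
    sd = statistics.stdev(excess)
    if sd == 0:
        raise ValueError("Sharpe ratio is undefined for constant returns")
    return statistics.mean(excess) / sd


def summarize(per_pair_returns):
    """Cross-pair means and standard deviations of returns and Sharpe ratios."""
    avg_returns = [statistics.mean(r) for r in per_pair_returns.values()]
    sharpes = [sharpe_ratio(r) for r in per_pair_returns.values()]
    return {
        "mean_return": statistics.mean(avg_returns),
        "std_return": statistics.stdev(avg_returns),
        "mean_sharpe": statistics.mean(sharpes),
        "std_sharpe": statistics.stdev(sharpes),
    }


# Hypothetical strategy returns for two pairs (not data from this study).
toy_returns = {
    ("CL", "BZ"): [0.01, 0.03],
    ("XLE", "VDE"): [0.02, -0.01, 0.02],
}
summary = summarize(toy_returns)
```

Plotting each pair's Sharpe ratio against its average return, once for the in-sample set and once for the out-of-sample set, yields scatter plots of the kind described above.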
Examining the results, we draw three observations. First, comparing the means of the out-of-sample returns and Sharpe ratios in Table 2 shows that the out-of-sample performance of the first sub-sample (mean return 0.0692 and mean Sharpe ratio 0.2869) is much better than that of the full sample period (mean return 0.0108 and mean Sharpe ratio 0.0763) and that of the second sub-sample period (mean return 0.02 and mean Sharpe ratio 0.1133). Because the out-of-sample data of the first sub-sample predate the pandemic, whereas in the other two cases the out-of-sample data fall after its onset, this comparison implies that the COVID-19 pandemic has exerted a strong negative impact on the performance of pairs trading in the energy-related financial market.
Second, the in-sample performance of the second sub-sample was outstanding, but the corresponding out-of-sample performance was poor. This is because the pandemic shock had a huge impact on price volatility and brought statistical arbitrage opportunities to some energy-related futures and equities. However, such opportunities exist only in the in-sample data and do not carry over to the out-of-sample period. In addition, the impact of the COVID-19 event is not consistent across the twenty securities in our pool. The standard deviation of the pairs trading returns across the 190 pairs is 3.6687, and the standard deviation of the corresponding Sharpe ratios is 3.1260.
Third, the performance of the second sub-sample is slightly better than that of the whole sample, in terms of both the mean return and the mean Sharpe ratio (as shown in the fourth and second rows of Table 2: 0.0200 vs. 0.0108 for the return, and 0.1133 vs. 0.0763 for the Sharpe ratio). This means that the pairs identified using pre-pandemic training data no longer work, and that pairs based on data from after the pandemic began perform better. Consistent with our earlier finding, this result shows that the COVID-19 pandemic brought a structural change to the energy market and affected the performance of pairs trading.
We also implement the Wilcoxon–Mann–Whitney test to compare the levels of out-of-sample performance across sampling periods, in order to determine whether the differences across data samples are statistically significant. For this purpose, we compare the distribution of either the returns or the Sharpe ratios for the full sample (or sub-sample 2) with that for sub-sample 1. The steps of the Wilcoxon–Mann–Whitney test are outlined below:
Let the distribution of the strategy returns for the full sample be \(F\), and the corresponding distribution for sub-sample 1 be \(G\). In the Wilcoxon–Mann–Whitney test, the null hypothesis, \(H_0\), is:
\[ H_0: G(x) = F(x) \quad \text{for all } x, \]
and the alternative hypothesis, \(H_1\), is
\[ H_1: G(x) = F(x - \Delta) \quad \text{for some } \Delta > 0. \]
That is, we want to test whether the distribution of the returns for sub-sample 1 is a location shift to the right of the corresponding distribution of the returns for the full sample.
The p-value of the test is computed.
A similar test is conducted to compare the distributions of the Sharpe ratios.
The same procedure is repeated to compare the distributions of returns and Sharpe Ratios between sub-sample 2 and sub-sample 1.
The p-values of the above tests are reported in Table 3. From this table, we conclude that the performance of pairs trading based on sub-sample 1 is better than that based on the full sample or sub-sample 2. The results of the Wilcoxon–Mann–Whitney test imply that the decline in the out-of-sample performance of pairs trading during the COVID-19 pandemic is statistically significant.
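The test described above can be sketched in a few lines. The following is an illustrative implementation of the one-sided Wilcoxon–Mann–Whitney test using midranks, a tie correction, a continuity correction and the normal approximation; it is not necessarily the routine used to produce Table 3, and the two samples are hypothetical.

```python
import math


def midranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def mann_whitney_greater(x, y):
    """U statistic of x and one-sided p-value for H1: x is shifted right of y."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    n = n1 + n2
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all observations are identical")
    z = (u1 - n1 * n2 / 2 - 0.5) / math.sqrt(var_u)  # continuity correction
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper tail of the normal
    return u1, p_value


# Hypothetical out-of-sample returns (not data from this study).
sub_sample_1 = [0.05, 0.07, 0.09, 0.11, 0.13, 0.15, 0.17, 0.19]
full_sample = [-0.02, 0.0, 0.01, 0.02, 0.03, 0.04, 0.06, 0.08]
u_stat, p_value = mann_whitney_greater(sub_sample_1, full_sample)
```

A small p-value rejects the null hypothesis in favour of a right shift for the first argument. The normal approximation is reasonable for samples as large as 190 pairs; for very small samples an exact test would be preferable.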
Finally, we report the ratio of the in-sample performance with respect to the out-of-sample performance for the three data periods in
Figure 9,
Figure 10 and
Figure 11. The ratio in these figures is defined as:
\[ R_i = \frac{X_i^{\mathrm{in}}}{X_i^{\mathrm{out}}}, \]
where \(X_i^{\mathrm{in}}\) is the return or Sharpe ratio of the in-sample set for the i-th pair, and \(X_i^{\mathrm{out}}\) is the corresponding value of the out-of-sample set for the i-th pair. It is clear that in
Figure 9 and Figure 10, most of the points are above the 45-degree line, while in Figure 11, most of the points are below the 45-degree line. This pattern implies that the pandemic increased the return volatility of energy-related securities and made pairs trading riskier when the training data included the period after the pandemic started.
5. Conclusions
In this paper, we have investigated one aspect of the effects of COVID-19 on the energy market. By constructing pairs trading portfolios from a representative group of energy-related securities and examining their performance, we draw several important implications.
First, we find that the performance of the strategy deteriorated sharply in the face of COVID-19, as shown by much lower out-of-sample average Sharpe ratios, lower average realized returns and higher variation of Sharpe ratios. The large gap between the in-sample and out-of-sample performance measures suggests that simply reusing pre-pandemic parameters does not generate sound results in the pandemic era. Second, the same strategy works much better if only the pre-pandemic data (sub-sample period 1) are studied. The performance measures improve, with higher average Sharpe ratios and lower variation of Sharpe ratios. This indicates that the arbitrage trading strategy worked before COVID-19 emerged and that the pandemic fundamentally changed how the strategy performs. Third, using only data from after the pandemic started (sub-sample period 2) improves the performance only slightly. Out-of-sample results remain poor after controlling for sub-samples, implying that the effect of COVID-19 is adverse or even destructive. In addition, two other implications are especially important for practitioners. First, increasing the asset span, as we do in the present paper, does not necessarily improve the results. Second, a heat map may serve as a good graphical indicator and an intuitive visual tool in the preparatory analysis for arbitrage-based trading strategies. The detailed usage of heat maps, and a more definite link between heat maps and strategy performance, warrant further examination.
The findings in this paper point to the importance of model innovations or modifications during the pandemic. Future research could proceed in multiple directions. For example, one could allow for time-varying parameters in the baseline model, or relax the model in favour of more general set-ups. Ref. [30] proposes a pairs trading algorithm based on general state spaces and finds that both the Sharpe ratios and the realized returns improve when the framework is applied to the equity market. Further investigation could follow this line of research by viewing COVID-19 as a change in the model’s state space. We leave this for future work.