Article

Risk Characterization of Firms with ESG Attributes Using a Supervised Machine Learning Method

by
Prodosh Eugene Simlai
Nistler College of Business and Public Administration, University of North Dakota, 3025 University Avenue, Grand Forks, ND 58202, USA
J. Risk Financial Manag. 2024, 17(5), 211; https://doi.org/10.3390/jrfm17050211
Submission received: 22 March 2024 / Revised: 6 May 2024 / Accepted: 17 May 2024 / Published: 19 May 2024
(This article belongs to the Special Issue Financial Valuation and Econometrics)

Abstract:
We examine the risk–return tradeoff of a portfolio of firms that have tangible environmental, social, and governance (ESG) attributes. We introduce a new type of penalized regression using the Mahalanobis distance-based method and show its usefulness using our sample of ESG firms. Our results show that ESG companies are exposed to financial state variables that capture the changes in investment opportunities. However, we find that there is no economically significant difference between the risk-adjusted returns of various ESG-rating-based portfolios and that the risk associated with a poor ESG rating portfolio is not significantly different than that of a good ESG rating portfolio. Although investors require return compensation for holding ESG stocks, the fact that the risk of a poor ESG rating portfolio is comparable to that of a good ESG rating portfolio suggests risk dimensions that go beyond ESG attributes. We further show that the new covariance-adjusted penalized regression improves the out-of-sample cross-sectional predictions of the ESG portfolio’s expected returns. Overall, our approach is pragmatic and emphasizes ease of empirical application.

1. Introduction

The last two decades have witnessed dramatic growth in sustainable investing—an investment strategy focused on environmental, social, and governance (ESG) criteria (Bolton and Kacperczyk 2021; Starks 2023). Green stocks are companies that are supposed to have an ESG footprint, while brown stocks are companies with no such expectations (Pástor et al. 2022). According to BlackRock (2020), 88% of their clients rank the environment as “the priority most in focus.” In the presidential address of the American Finance Association, Starks (2023) argues that the motivations for investing in ESG products are either value-based or values-based. According to Starks (2023), the term ESG value relates to traditional investment goals such as risk management and return opportunities, while the term ESG values suggests that the investment motive is nonpecuniary. Although the performance of ESG companies is not a black box, little research has been completed on the risks associated with investing in green companies.
In this paper, we use firm-level monthly data from 2009 to 2022 and investigate the risk–return tradeoff of a portfolio of companies that have tangible ESG attributes. We create our sample using companies with ESG ratings from Morningstar Sustainalytics. We introduce a new type of penalized regression using the Mahalanobis distance-based covariance-adjusted method (Mahalanobis 1936) and discuss the usefulness of this covariance-adjusted shrinkage estimate using the sample of ESG firms. We show the relative performance of the resulting shrinkage estimate from a cross-sectional regression model that allows non-sparse and heteroskedastic residuals. Similar to the existing work (Bolton and Kacperczyk 2021; Safiullah et al. 2022), our results show that ESG companies are exposed to financial state variables that capture changes in investment opportunities. However, we find that there is no economically significant difference between the risk-adjusted returns of various ESG-rating-based portfolios. Furthermore, investors require return compensation for holding ESG stocks, but the risk associated with a poor ESG rating portfolio is comparable to that of a good ESG rating portfolio. We also find that the new covariance-adjusted penalized regression improves the out-of-sample performance of cross-sectional predictions of the ESG portfolio’s expected returns.
This paper is organized as follows. The following section provides the motivation and our incremental contribution with respect to the existing literature. The section after that describes the basic setup and theoretical results. The next section contains our main empirical results. In the final section, we conclude with brief comments.

2. Motivation and Our Incremental Contribution

In supervised learning, penalized regressions and the resulting shrinkage estimators are useful tools for tackling real-world issues encountered in big data (see, e.g., Athey 2018; Belloni et al. 2012; Chernozhukov et al. 2017; Kapetanios and Zikes 2018; Mullainathan and Spiess 2017; Stock and Watson 2019; and Varian 2014). The commonly used shrinkage estimators, which include Ridge regression (Hoerl and Kennard 1970), LASSO (least absolute shrinkage and selection operator) regression (Tibshirani 1996), and Elastic Net regression (Zou and Hastie 2005), are usually implemented under sparse and homoskedastic regression errors (Jia et al. 2013). We propose a Mahalanobis distance-based penalized regression in the presence of non-sparse and heteroskedastic residuals and show that the incorporation of covariance-adjusted shrinkage improves the out-of-sample performance. The Mahalanobis distance, first proposed by the statistician P. C. Mahalanobis (1893–1972), is a well-known tool in multivariate analysis (Berrendero et al. 2020).
In many applications in finance, we encounter a large number of primitive predictors ($p$) relative to the number of observations ($n$), together with a regression error that is not sparse and displays heteroskedasticity, with a variance that is a function of the primitive predictors. One such example is the cross-sectional regression model used to evaluate the returns of a group of assets or portfolios by means of various risk factor exposures. Examples of such equity risk factor models include the cross-sectional implementation of the capital asset pricing model (CAPM), the Fama–French five factors (Fama and French 2016), the q-factor model (Hou et al. 2015), etc. If the variability of returns of an asset or a portfolio displays cross-sectional clustering, the residuals of a risk factor regression model can be a function of the risk attributes, which can in turn lead to a non-sparse residual covariance structure and heteroskedasticity. This paper contributes to the literature on the implications of ESG ratings in portfolio and risk management. Recent work by Giese et al. (2019) suggests that the traditional systematic risk factor contains ESG risk information, while Cohen (2023) argues that the ESG score is diminishing for US large-cap firms. In related work, Bannier et al. (2023) use a corporate social responsibility (CSR) criterion and argue that the lowest CSR-rated portfolios outperform their higher CSR-rated counterparts. Although the evidence is not always conclusive, the risk characterization of ESG stocks is valuable for portfolio managers. The results of this paper add to the findings of Giese et al. (2019) and Cohen (2023).
In terms of methodology, even though we can improve upon ordinary least squares (OLS) with generalized least squares (GLS) or other weighted least-squares alternatives, comparable heteroskedasticity-corrected estimates under the shrinkage method are not readily available. There are also reservations about the finite-sample properties of GLS estimators. As an example, Angrist and Pischke (2008, sct. 3.4.1) argue that “if the conditional variance model is a poor approximation or if the estimates of it are very noisy, [the weighted] estimators may have worse finite-sample properties than unweighted estimators.” We propose an alternative method that uses Mahalanobis distance-based shrinkage, in which we shrink the risk factor regression parameters using their estimated variance–covariance matrix. We show that the resulting estimate is biased but that its variance is less than the corresponding unconstrained GLS variance. Thus, when $p$ is as large as $n$, we can obtain efficient estimates of the risk factor regression model parameters, which can improve the credibility of the in-sample prediction model used to generate predictions for out-of-sample observations.

3. Basic Setup and Theoretical Results

Suppose that $Y=(y_1,\ldots,y_n)'$ is an $(n\times 1)$ response vector and $X=(x_1,\ldots,x_p)$ is the $(n\times p)$ matrix of covariates, where $x_j=(x_{1j},\ldots,x_{nj})'$, $j=1,\ldots,p$, are the linearly independent predictors. Let $(Y,X)$ be the transformation of the originally collected data in which all the covariates are standardized to have mean 0 and variance 1 and the dependent variable is transformed to have mean 0. We assume that $Y=X\beta+\varepsilon$, where $\varepsilon$ is a vector of residuals with $E(\varepsilon\mid X)=0$ and $V(\varepsilon\mid X)=\Omega\neq\sigma^2 I$. It is well known that when $p<n$ and the error term is heteroskedastic, the OLS estimate of $\beta$ is consistent, but the GLS estimate $\hat{\beta}_{GLS}$ is efficient and can easily be obtained using standard statistical packages with or without explicit knowledge of the source of heteroskedasticity. When $p>n$, a possible alternative is to shrink the estimators by methods such as the Ridge, LASSO, and Elastic Net regressions. The shrinkage-based estimators are biased and do not have causal interpretations. Unlike OLS and GLS, both of which produce non-zero estimates for all coefficients, methods such as LASSO and Elastic Net perform automatic variable selection and produce parsimonious final models.
In the Ridge regression (Hoerl and Kennard 1970), we minimize the residual sum of squares (RSS) subject to a bound on the L2 norm (Euclidean distance) of the slope coefficients, whereas in the LASSO regression (Tibshirani 1996), we minimize the RSS subject to an L1 penalty (Manhattan distance) on the regression slope coefficients. In contrast, in the Elastic Net regression (Zou and Hastie 2005), we minimize the RSS subject to a combination of the penalizations using the L1 and L2 norms.
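For readers who wish to experiment with these standard penalties, the following sketch fits all three with scikit-learn on simulated data; the data-generating process and the tuning values (`alpha`, `l1_ratio`) are illustrative assumptions rather than choices made in this paper.

```python
# Illustrative comparison of Ridge (L2), LASSO (L1), and Elastic Net (L1 + L2)
# penalties on simulated data; all tuning values are arbitrary choices.
import numpy as np
from sklearn.linear_model import Ridge, Lasso, ElasticNet

rng = np.random.default_rng(0)
n, p = 100, 20
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = [1.5, -2.0, 0.8, 0.3, 1.2]          # only a few non-zero coefficients
y = X @ beta + rng.standard_normal(n)

models = {
    "Ridge (L2)": Ridge(alpha=1.0),
    "LASSO (L1)": Lasso(alpha=0.1),
    "Elastic Net (L1+L2)": ElasticNet(alpha=0.1, l1_ratio=0.5),
}
for name, model in models.items():
    model.fit(X, y)
    n_zero = int(np.sum(np.isclose(model.coef_, 0.0)))
    print(f"{name:22s} coefficients set exactly to zero: {n_zero}")
```

As expected, the L1-based penalties zero out some coefficients, whereas Ridge only shrinks them toward zero.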
We propose that if the residuals display heteroskedasticity, that is, if $\varepsilon\mid X\sim N(0,\Omega)$ with $\Omega\neq\sigma^2 I$, we can obtain an estimate of $\beta$ using the Mahalanobis distance-based shrinkage (MDS) regression. The premise of the MDS regression is to minimize the (generalized) residual sum of squares subject to a bound on the norm of the beta vector scaled by the variance–covariance matrix, as given by the following:
$$\hat{\beta}_{MDS}=\operatorname*{argmin}_{\beta}\;(Y-X\beta)'\,\Omega^{-1}(Y-X\beta)+\lambda\,\beta'(X'\Omega^{-1}X)^{-1}\beta \qquad (1)$$
where the tuning parameter $\lambda$ can be selected by a cross-validation method. The MDS approach under (1) is a continuous shrinkage method that retains all the covariates but penalizes large coefficients through a modified L2 norm. The first part of the loss function in (1) is the same as that of the GLS. The second part of the loss function involves the regularization of parameters because it penalizes larger coefficient values. It is immediate that as $\lambda\to 0$, $\hat{\beta}_{MDS}\to\hat{\beta}_{GLS}$. In the presence of a non-zero penalty term, if $\alpha=0$ and $\Omega=\sigma^2 I$, $\hat{\beta}_{MDS}=\hat{\beta}_{Ridge}$, and if $\alpha=1$ and $\Omega=\sigma^2 I$, $\hat{\beta}_{MDS}=\hat{\beta}_{LASSO}$, where $\alpha$ denotes the relative weight on the $L_1$ component in an elastic-net-style generalization of the penalty.
The basic properties of the MDS estimate are given by the following result.
Lemma 1.
The MDS estimate given by $\hat{\beta}_{MDS}=QX'\Omega^{-1}Y$ with $Q=[(X'\Omega^{-1}X)+\lambda(X'\Omega^{-1}X)^{-1}]^{-1}$ is biased and has a variance of $V(\hat{\beta}_{MDS})=Q(X'\Omega^{-1}X)Q$.
Proof. 
See the Appendix A. □
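A minimal NumPy sketch of the closed-form MDS estimator in Lemma 1 is given below; the simulated design, the diagonal $\Omega$, and the value $\lambda=5$ are illustrative assumptions, and in practice $\lambda$ would be chosen by cross-validation as noted above.

```python
# Closed-form MDS estimator from Lemma 1:
#   beta_MDS = Q X' Omega^{-1} Y,  Q = [X'Omega^{-1}X + lam (X'Omega^{-1}X)^{-1}]^{-1}
# The data-generating process below is purely illustrative.
import numpy as np

def mds_estimate(X, Y, Omega, lam):
    Omega_inv = np.linalg.inv(Omega)
    A = X.T @ Omega_inv @ X                          # X' Omega^{-1} X
    Q = np.linalg.inv(A + lam * np.linalg.inv(A))
    return Q @ X.T @ Omega_inv @ Y

rng = np.random.default_rng(1)
n, p = 60, 8
X = rng.standard_normal((n, p))
beta = rng.standard_normal(p)
Omega = np.diag(1.0 + X[:, 0] ** 2)                  # heteroskedastic error variances
Y = X @ beta + rng.multivariate_normal(np.zeros(n), Omega)

Omega_inv = np.linalg.inv(Omega)
beta_gls = np.linalg.solve(X.T @ Omega_inv @ X, X.T @ Omega_inv @ Y)
beta_mds = mds_estimate(X, Y, Omega, lam=5.0)
print("GLS vs MDS (first 3 coefficients):", beta_gls[:3], beta_mds[:3])
```

Setting `lam` close to zero reproduces the GLS estimate, consistent with the limiting behavior noted after (1).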
Regarding the bias and MSE of regression predictions using MDS estimates, we have the following result.
Lemma 2.
The bias and the MSE of $X\hat{\beta}_{MDS}$ are given by $-\lambda XQ(X'\Omega^{-1}X)^{-1}\beta$ and $\mathrm{Tr}[XQ(X'\Omega^{-1}X)QX']+\lambda^2\beta'(X'\Omega^{-1}X)^{-1}QX'XQ(X'\Omega^{-1}X)^{-1}\beta$, respectively.
Proof. 
See the Appendix A. □
We observe that, when $\lambda=0$, $Q=(X'\Omega^{-1}X)^{-1}$ and, consequently, $MSE(X\hat{\beta}_{MDS})=MSE(X\hat{\beta}_{GLS})$. Finally, regarding the efficiency of the MDS estimates, we provide the following result.
Lemma 3.
The difference between the variance of $\hat{\beta}_{GLS}$ and that of $\hat{\beta}_{MDS}$ is non-negative definite.
Proof. 
See the Appendix A. □
It is important to recognize that the MDS estimate in (1) minimizes the RSS together with a penalty on a specific norm of the beta vector scaled by the variance–covariance matrix. Since the GLS estimate is unbiased, its MSE consists entirely of variance, which can cause overfitting and, in turn, inaccurate predictions. For the MDS estimate, however, we incorporate a covariance-adjusted penalty through the beta vector and thereby adjust the variance of the shrinkage-based estimate. Consequently, we end up with an estimated model that is not overfitted and is thus capable of producing better predictions. This is seen in our empirical results presented in the next section.
We recognize that there is related existing work, such as that of Belloni et al. (2012), who use Lasso and post-Lasso for estimating the first-stage regression of endogenous variables on the instruments. In their method, the Lasso estimator selects instruments and estimates the first-stage regression coefficients via a shrinkage procedure. The post-Lasso estimator of Belloni et al. (2012) discards the Lasso coefficient estimates and uses the data-dependent set of instruments selected by Lasso to refit the first-stage regression via OLS, alleviating Lasso’s shrinkage bias. We stress that our objective in this paper is not to demonstrate efficiency gains from using optimal instruments. Since heteroskedasticity does not introduce bias into the estimator, one should not count on modeling the changing variance to produce superior forecasting results. Instead, we explore a practical penalized regression method in the presence of heteroskedasticity, which may provide a useful complement to existing approaches.

4. Empirical Analysis of ESG Firms

For the empirical illustration, we use firm-level data from January 2009 to December 2022. We restrict our sample to companies with ESG ratings from Morningstar Sustainalytics. The Morningstar rating measures the degree to which individual companies face financial risks from ESG issues and then rolls those individual scores up into an overall, portfolio-level score.1 In addition to Morningstar Sustainalytics, we utilize the CRSP–COMPUSTAT merged database and Kenneth French’s data library. Beginning in January 2009, each month we rank every stock by its ESG rating from Sustainalytics and form three portfolios based on ESG ratings. Over the sample period, we also rank all stocks into three size categories based on the market capitalization of each firm with an ESG rating. As a result, we obtain nine portfolios and the time series of returns for each portfolio. The empirical methodology implemented in this paper is consistent with much of the prior literature.
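The portfolio formation step can be sketched as follows; the column names (`date`, `esg_rating`, `mktcap`, `exret`) are hypothetical, and the use of monthly terciles via `pd.qcut` with equal weighting inside each portfolio is an assumption made for illustration.

```python
# Sketch of the 3x3 independent double sort on ESG rating and market cap.
import pandas as pd

def form_portfolios(df: pd.DataFrame) -> pd.DataFrame:
    """df: firm-month panel with columns date, esg_rating, mktcap, exret."""
    out = df.copy()
    # Independent tercile sorts within each month (0 = lowest, 2 = highest)
    out["esg_bin"] = out.groupby("date")["esg_rating"].transform(
        lambda s: pd.qcut(s, 3, labels=False))
    out["size_bin"] = out.groupby("date")["mktcap"].transform(
        lambda s: pd.qcut(s, 3, labels=False))
    # Equal-weighted monthly excess return of each of the nine portfolios
    return (out.groupby(["date", "esg_bin", "size_bin"])["exret"]
               .mean()
               .unstack(["esg_bin", "size_bin"]))

# Usage (hypothetical): port_returns = form_portfolios(firm_month_panel)
```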
Table 1 reports the average excess returns of the nine portfolios constructed by independent sorts on ESG rating and market cap. In our sample, firms with poor ESG ratings tend to be small, while firms with better ESG ratings tend to have large market capitalizations. As is well known in the literature, Table 1 shows an inverse relationship between firm size and average excess returns in our sample: the average excess return of the three small and the three big portfolios is 0.52% per month and 0.41% per month, respectively. The table also shows an inverse relationship between ESG ratings and average excess returns. For example, for the small-cap portfolios, the average excess return decreases monotonically from 0.59% per month for the worst ESG rating portfolio to 0.46% per month for the best ESG rating portfolio. On average, irrespective of the size group we consider, the portfolio of firms with the best ESG ratings produces lower average returns than the portfolio of firms with the worst ESG ratings. As shown by the long–short portfolio, the difference between the average returns of the worst and the best ESG rating portfolios is not statistically significant.
Next, for each portfolio, we obtain the risk-adjusted return using the Fama and French (2015, 2016) multifactor risk model. Let $R_{i,t+1}$ be the excess return of the $i$th portfolio in period $t+1$. A full specification of the 5-factor Fama–French time-series regression model takes the following form:
$$R_{i,t+1}=\alpha_i+\beta_{i,Mktrf}\,Mktrf_{t+1}+\beta_{i,SMB}\,SMB_{t+1}+\beta_{i,HML}\,HML_{t+1}+\beta_{i,CMA}\,CMA_{t+1}+\beta_{i,RMW}\,RMW_{t+1}+e_{i,t+1},\quad t=1,\ldots,T \qquad (2)$$
where $Mktrf$ is the excess market return, $SMB$ is the size factor, $HML$ is the value factor, $CMA$ is the investment factor, and $RMW$ is the profitability factor. While the size and value factors are empirically motivated, the investment and profitability factors are theoretically motivated. The addition of $CMA$ and $RMW$ captures the drivers of expected returns emphasized in the q-factor model, while the size and value factors are associated with the static dimensions of the firms (Hou et al. 2015, 2020).
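A sketch of the first-pass time-series regression (2) for a single portfolio is given below, using statsmodels; the factor column names follow the Kenneth French data library convention (Mkt-RF, SMB, HML, RMW, CMA), and the input objects are hypothetical placeholders.

```python
# First-pass five-factor time-series regression (2) for one portfolio.
import pandas as pd
import statsmodels.api as sm

def five_factor_betas(port_ret: pd.Series, factors: pd.DataFrame):
    """port_ret: monthly excess returns of one portfolio; factors: FF5 factors."""
    data = pd.concat([port_ret.rename("exret"), factors], axis=1, join="inner")
    X = sm.add_constant(data[["Mkt-RF", "SMB", "HML", "RMW", "CMA"]])
    fit = sm.OLS(data["exret"], X).fit()
    return fit.params, fit.tvalues        # alpha ("const") plus the five betas

# Usage (hypothetical): params, tstats = five_factor_betas(port_returns[(0, 0)], ff5)
```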
If the alpha corresponding to the 5-factor model is significant, it suggests that the risk factors are not successful in explaining the abnormal returns of the portfolio. The risk-adjusted returns of all nine double-sorted portfolios are also reported in the lower panel of Table 1. None of the alphas corresponding to the 5-factor FF regression model are statistically significant, at least at the 5% level. Unlike the average excess returns, the estimated alphas do not display any clear patterns either. The insignificant alphas corresponding to the 5-factor model suggest that the risk-adjusted portfolio return does not persist after controlling for the risk factors. Although investors require return compensation for holding ESG stocks, the risk of a poor ESG rating portfolio is comparable to that of a good ESG rating portfolio, which suggests an additional risk dimension not incorporated in the existing literature.
As a next step, after obtaining the factor exposures from the time-series regression (2), the factor betas are used as explanatory variables in the following cross-sectional regression:
$$\bar{R}_i=\gamma_0+\gamma_1\hat{\beta}_{i,Mktrf}+\gamma_2\hat{\beta}_{i,SMB}+\gamma_3\hat{\beta}_{i,HML}+\gamma_4\hat{\beta}_{i,CMA}+\gamma_5\hat{\beta}_{i,RMW}+\varepsilon_i,\quad i=1,\ldots,9 \qquad (3)$$
Traditionally, for evaluating a portfolio return, the 5-factor model includes a set of five primitive predictors in the time-series regression (2). As a result, the benchmark factor betas show up in linear form in the cross-sectional regression (3). If the portfolio’s loadings on the risk factors are important determinants of average returns, the slope coefficients on the $\hat{\beta}_{i,j}$’s in (3) should be statistically significant. In order to allow a non-sparse covariance structure and heteroskedasticity, we evaluate two variations of the cross-sectional regression with simulated residual structures. Note that the double-sorted portfolio construction scheme ensures that the test portfolios differ in their level of ESG ratings and market cap. The portfolio-level data that we use in (2) and (3) also help us avoid issues such as infrequent trading and outliers.
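As a baseline, a plain OLS version of (3) can be run before the GLS and MDS variants; the sketch below uses statsmodels, with `betas` and `avg_ret` as hypothetical inputs holding the first-pass loadings and the mean excess returns of the nine portfolios.

```python
# Second-pass cross-sectional regression (3): average portfolio excess returns
# on the estimated factor betas (OLS baseline; inputs are illustrative).
import pandas as pd
import statsmodels.api as sm

def cross_sectional_fit(avg_ret: pd.Series, betas: pd.DataFrame):
    """avg_ret: length-9 mean excess returns; betas: 9 x 5 first-pass loadings."""
    X = sm.add_constant(betas)
    fit = sm.OLS(avg_ret, X).fit()
    return fit.params, fit.bse            # gamma_0, ..., gamma_5 and their SEs
```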
First, in order to allow non-sparse residuals, we assume that $Corr(\varepsilon_i,\varepsilon_j)=\Omega_{ij}=\rho_{ij}$ for all $i,j$. Consequently, we construct $\Omega=[\Omega_{ij}]$ and compute the GLS and MDS estimates of the slope coefficients. We identify neighbors using the relative ESG rating and market cap of each portfolio and assume that the errors take a simple autoregressive form $\varepsilon_{i+1}=\rho\varepsilon_i+\vartheta_i$ with $\vartheta_i\sim N(0,\theta^2)$. We use five different pre-assigned values for $\rho$—0.10, 0.20, 0.50, 0.80, and 0.90—and experiment with seven alternative pre-determined values of the tuning parameter $\lambda$—0.5, 1.0, 3.0, 5.0, 10, 20, and 50. The MSE from the testing set using various methods is given in Table 2. We see that, irrespective of the $\rho$ we consider, the MDS regressions corresponding to $\lambda<5$ always produce a smaller MSE than the GLS counterpart. Even for high $\lambda$, the MSE of the GLS regressions exceeds that of MDS. For example, for $\lambda=5$, the testing sample MSE for the MDS regressions varies between 0.0491 and 0.0570, whereas the same for GLS is between 0.0630 and 0.0697. The general pattern of higher MSE for GLS than for the MDS regressions suggests an improvement in predictive accuracy on the testing set.
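A minimal sketch of this first experiment is given below, under illustrative assumptions: the AR(1)-implied correlation matrix is built from the ordering of the portfolios, `betas` and `avg_ret` are placeholder inputs, and the particular train–test partition is arbitrary.

```python
# Non-sparse residual experiment: AR(1) correlation structure over the portfolios,
# GLS and MDS fits on a training subset, and MSE on the held-out portfolios.
import numpy as np

def ar1_corr(n, rho):
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])   # correlation decays with distance

def gls_fit(X, y, Omega):
    Oi = np.linalg.inv(Omega)
    return np.linalg.solve(X.T @ Oi @ X, X.T @ Oi @ y)

def mds_fit(X, y, Omega, lam):
    Oi = np.linalg.inv(Omega)
    A = X.T @ Oi @ X
    return np.linalg.inv(A + lam * np.linalg.inv(A)) @ X.T @ Oi @ y

def test_mse(betas, avg_ret, rho, lam, train_idx, test_idx):
    X = np.column_stack([np.ones(len(avg_ret)), betas])  # intercept + five betas
    Om_train = ar1_corr(len(train_idx), rho)
    b_gls = gls_fit(X[train_idx], avg_ret[train_idx], Om_train)
    b_mds = mds_fit(X[train_idx], avg_ret[train_idx], Om_train, lam)
    err = lambda b: np.mean((avg_ret[test_idx] - X[test_idx] @ b) ** 2)
    return err(b_gls), err(b_mds)
```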
Next, we implement a form of heteroskedastic regression by modeling the error variance with non-linear forms of heteroskedasticity. More specifically, we assume that the error variance is a function of the factor betas together with all of their squares, cubes, and pairwise interactions. With the incorporation of all these auxiliary variables (five levels, five squares, five cubes, ten pairwise interactions, and an intercept), the number of parameters in the variance function becomes 26.
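The count of variance parameters can be reproduced with the following sketch of the auxiliary design matrix for the variance function; the `betas` input is a hypothetical array of the nine portfolios’ five estimated loadings.

```python
# Auxiliary variance design: betas, their squares and cubes, all pairwise
# interactions, and an intercept -> 5 + 5 + 5 + 10 + 1 = 26 columns.
import numpy as np
from itertools import combinations

def variance_design(betas: np.ndarray) -> np.ndarray:
    """betas: (n_portfolios, 5) array of estimated factor loadings."""
    n, k = betas.shape
    levels = betas
    squares = betas ** 2
    cubes = betas ** 3
    inter = np.column_stack([betas[:, i] * betas[:, j]
                             for i, j in combinations(range(k), 2)])
    return np.column_stack([np.ones(n), levels, squares, cubes, inter])

# Example: variance_design(np.random.default_rng(2).standard_normal((9, 5))).shape -> (9, 26)
```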
Table 3 presents summary statistics for the out-of-sample predictions from the heteroskedastic cross-sectional regressions, where we split the test portfolios into training and testing samples. We use the training sample to estimate the prediction models and the testing sample to assess how each model performs. To gauge the sensitivity of our results, we split the sample 100 times. Each time, we randomly select 56% of the observations to form a training set and assign the remaining 44% of the available data to a testing set. To maintain consistency with the earlier results, we report the out-of-sample prediction errors of GLS and MDS corresponding to the same seven pre-determined values of $\lambda$ as in Table 2.
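A sketch of this repeated-split evaluation follows; `fit_fn` stands for any of the estimators considered (e.g., the GLS or MDS sketches above), and the way the squared and absolute errors are pooled across splits is an assumption about how the Table 3 summary statistics are computed.

```python
# Repeated random splits: 100 draws, 56% training / 44% testing, with squared
# and absolute out-of-sample prediction errors recorded for each draw.
import numpy as np

def repeated_split_errors(X, y, fit_fn, n_splits=100, train_frac=0.56, seed=0):
    rng = np.random.default_rng(seed)
    n = len(y)
    sq_errs, abs_resids = [], []
    for _ in range(n_splits):
        train = rng.choice(n, size=int(round(train_frac * n)), replace=False)
        test = np.setdiff1d(np.arange(n), train)
        b = fit_fn(X[train], y[train])                 # e.g., GLS or MDS coefficients
        resid = y[test] - X[test] @ b
        sq_errs.append(np.mean(resid ** 2))
        abs_resids.extend(np.abs(resid))
    return {"mean_squared_error": float(np.mean(sq_errs)),
            "mean_absolute_error": float(np.mean(abs_resids)),
            "median_absolute_error": float(np.median(abs_resids))}
```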
A close inspection of the reported results reveals that several MDS regressions have very similar values for the average squared error and the average absolute error. For most values of $\lambda$, the average squared error is very close to 0.19. Compared to GLS, the MDS regression with $\lambda=0.5$ results in a 5% reduction in the average squared prediction error; for $\lambda=1$ and 3, the reductions become 9% and 6%, respectively. Similar observations can be made for the average and median absolute errors. In sum, based on the out-of-sample prediction errors, the slight dominance of MDS over GLS persists even in our heteroskedastic regressions. Therefore, the incorporation of covariance-adjusted shrinkage improves the out-of-sample prediction of the ESG portfolios’ expected returns in the heteroskedastic cross-sectional regression model.

5. Conclusions

In this paper, we use a new covariance-adjusted shrinkage method to examine the risk–return tradeoff of a portfolio of companies that have tangible environmental, social, and governance (ESG) attributes according to Morningstar Sustainalytics. We introduce a new type of penalized regression that minimizes the residual sum of squares subject to a Mahalanobis distance-based penalty and show that the resulting estimates are biased but have a smaller variance than the corresponding unconstrained GLS estimates. The results show an absence of significant risk-adjusted returns for the various portfolios constructed using ESG ratings. Furthermore, we find that the risks associated with the various ESG-rating-based portfolios are comparable. The incorporation of the covariance-adjusted shrinkage method in the cross-sectional regression also improves the predictive accuracy of expected returns. Ours is the first paper to provide a covariance-adjusted penalty of this form, which can be important for applied economic analysis. In terms of practical applications, our method is simple to implement and provides a compromise between model interpretation and improved prediction performance. If a researcher is concerned about heteroskedasticity, our method can be complementary to other existing approaches. Note that our methodology is not free from limitations. Our approach in this paper is pragmatic and emphasizes ease of empirical application. Future work can focus on the theoretical and asymptotic properties of the MDS estimates.

Funding

This research received no external funding.

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained from WRDS and are available with the permission of WRDS.

Conflicts of Interest

The author declares no conflict of interest.

Appendix A

Proof of Lemma 1.
For the objective function (1), the first-order condition with respect to β is
$$[X'\Omega^{-1}X+\lambda(X'\Omega^{-1}X)^{-1}]\beta = X'\Omega^{-1}Y.$$
Solving for $\beta$, we obtain
$$\hat{\beta}_{MDS}=Q(X'\Omega^{-1}Y)=P\hat{\beta}_{GLS},$$
where $Q=[(X'\Omega^{-1}X)+\lambda(X'\Omega^{-1}X)^{-1}]^{-1}$, $P=QX'\Omega^{-1}X$, and $\hat{\beta}_{GLS}=(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y$.
The expectation of $\hat{\beta}_{MDS}$ is given by
$$E(\hat{\beta}_{MDS})=E[Q(X'\Omega^{-1}Y)]=QX'\Omega^{-1}E(Y)=QX'\Omega^{-1}X\beta=Q[Q^{-1}-\lambda(X'\Omega^{-1}X)^{-1}]\beta=\beta-\lambda Q(X'\Omega^{-1}X)^{-1}\beta\neq\beta.$$
The variance of $\hat{\beta}_{MDS}$ is given by
$$V(\hat{\beta}_{MDS})=PV(\hat{\beta}_{GLS})P'=QX'\Omega^{-1}X(X'\Omega^{-1}X)^{-1}X'\Omega^{-1}XQ=Q(X'\Omega^{-1}X)Q. \;\square$$
Proof of Lemma 2.
Using Lemma 1, the expectation of $X\hat{\beta}_{MDS}$ can be expressed as
$$E(X\hat{\beta}_{MDS})=X\beta-\lambda XQ(X'\Omega^{-1}X)^{-1}\beta.$$
So, the bias is
$$Bias(X\hat{\beta}_{MDS})=E(X\hat{\beta}_{MDS})-X\beta=-\lambda XQ(X'\Omega^{-1}X)^{-1}\beta.$$
The variance of $X\hat{\beta}_{MDS}$ is given by
$$V(X\hat{\beta}_{MDS})=XV(\hat{\beta}_{MDS})X'=XQ(X'\Omega^{-1}X)QX',$$
and
$$\|Bias(X\hat{\beta}_{MDS})\|_2^2=[E(X\hat{\beta}_{MDS})-X\beta]'[E(X\hat{\beta}_{MDS})-X\beta]=\lambda^2\beta'(X'\Omega^{-1}X)^{-1}QX'XQ(X'\Omega^{-1}X)^{-1}\beta.$$
Hence, the MSE is
$$MSE(X\hat{\beta}_{MDS})=\mathrm{Tr}[V(X\hat{\beta}_{MDS})]+\|Bias(X\hat{\beta}_{MDS})\|_2^2=\mathrm{Tr}[XQ(X'\Omega^{-1}X)QX']+\lambda^2\beta'(X'\Omega^{-1}X)^{-1}QX'XQ(X'\Omega^{-1}X)^{-1}\beta. \;\square$$
Proof of Lemma 3.
$$V(\hat{\beta}_{GLS})-V(\hat{\beta}_{MDS})=(X'\Omega^{-1}X)^{-1}-Q(X'\Omega^{-1}X)Q$$
$$=Q[Q^{-1}(X'\Omega^{-1}X)^{-1}Q^{-1}-X'\Omega^{-1}X]Q$$
$$=Q\{[X'\Omega^{-1}X+\lambda(X'\Omega^{-1}X)^{-1}](X'\Omega^{-1}X)^{-1}[X'\Omega^{-1}X+\lambda(X'\Omega^{-1}X)^{-1}]-X'\Omega^{-1}X\}Q$$
$$=Q[2\lambda(X'\Omega^{-1}X)^{-1}+\lambda^2(X'\Omega^{-1}X)^{-3}]Q$$
$$=[(X'\Omega^{-1}X)+\lambda(X'\Omega^{-1}X)^{-1}]^{-1}[2\lambda(X'\Omega^{-1}X)^{-1}+\lambda^2(X'\Omega^{-1}X)^{-3}][(X'\Omega^{-1}X)+\lambda(X'\Omega^{-1}X)^{-1}]^{-1}.$$
Since each of the components of the matrix product is non-negative definite, the difference is non-negative definite. □
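As a quick numerical sanity check of Lemma 3 (separate from the formal argument above), the following sketch generates a random design and a random diagonal $\Omega$ and verifies that the smallest eigenvalue of $V(\hat{\beta}_{GLS})-V(\hat{\beta}_{MDS})$, computed from the closed-form expressions, is non-negative; all inputs are illustrative.

```python
# Numerical check that V(beta_GLS) - V(beta_MDS) has no negative eigenvalues.
import numpy as np

rng = np.random.default_rng(3)
n, p, lam = 40, 6, 2.0
X = rng.standard_normal((n, p))
Omega = np.diag(rng.uniform(0.5, 3.0, size=n))       # heteroskedastic error variances

A = X.T @ np.linalg.inv(Omega) @ X                   # X' Omega^{-1} X
Q = np.linalg.inv(A + lam * np.linalg.inv(A))
V_gls = np.linalg.inv(A)
V_mds = Q @ A @ Q
print("smallest eigenvalue of V_GLS - V_MDS:",
      np.linalg.eigvalsh(V_gls - V_mds).min())       # should be >= 0 (up to rounding)
```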

Note

1
The rating is expressed as 1 to 5 globes; a higher number of globes indicates that the portfolio has lower (or negligible) ESG risk, and vice versa. See https://www.morningstar.com/sustainable-investing (accessed on 12 November 2023) for details.

References

  1. Angrist, Joshua David, and Jörn-Steffen Pischke. 2008. Mostly Harmless Econometrics: An Empiricist’s Companion. Princeton: Princeton University Press. [Google Scholar]
  2. Athey, Susan. 2018. The impact of machine learning on economics. In The Economics of Artificial Intelligence: An Agenda. Edited by Ajay K. Agrawal, Joshua Gans and Avi Goldfarb. Chicago: University of Chicago Press. [Google Scholar]
  3. Bannier, Christina E., Yannik Bofinger, and Björn Rock. 2023. The risk-return tradeoff: Are sustainable investors compensated adequately? Journal of Asset Management 24: 165–72. [Google Scholar] [CrossRef]
  4. Belloni, Alexandre, Daniel Chen, Victor Chernozhukov, and Christian Hansen. 2012. Sparse models and methods for optimal instruments with an application to eminent domain. Econometrica 80: 2369–429. [Google Scholar]
  5. Berrendero, José R., Beatriz Bueno-Larraz, and Antonio Cuevas. 2020. On Mahalanobis distance in functional settings. Journal of Machine Learning Research 21: 1–33. [Google Scholar]
  6. BlackRock. 2020. Global Sustainable Investing Survey. Available online: https://img.lalr.co/cms/2021/05/28202727/blackrock-sustainability-survey.pdf (accessed on 1 July 2021).
  7. Bolton, Patrick, and Marcin Kacperczyk. 2021. Do investors care about carbon risk? Journal of Financial Economics 142: 517–49. [Google Scholar] [CrossRef]
  8. Chernozhukov, Victor, Denis Chetverikov, Mert Demirer, Esther Duflo, Christian Hansen, and Whitney Newey. 2017. Double/Debiased/Neyman machine learning of treatment Effects. American Economic Review 107: 261–65. [Google Scholar] [CrossRef]
  9. Cohen, Gil. 2023. The impact of ESG risks on corporate value. Review of Quantitative Finance and Accounting 60: 1451–68. [Google Scholar] [CrossRef]
  10. Fama, Eugene F., and Kenneth R. French. 2015. A five-factor asset pricing model. Journal of Financial Economics 116: 1–22. [Google Scholar] [CrossRef]
  11. Fama, Eugene F., and Kenneth R. French. 2016. Dissecting anomalies with a five-factor model. The Review of Financial Studies 29: 69–103. [Google Scholar] [CrossRef]
  12. Giese, Guido, Linda-Eling Lee, Dimitris Melas, Zoltán Nagy, and Laura Nishikawa. 2019. Foundation of ESG investing: How ESG affects equity valuation, risk, and performance. The Journal of Portfolio Management 45: 69–83. [Google Scholar] [CrossRef]
  13. Hoerl, Arthur E., and Robert W. Kennard. 1970. Ridge regression: Biased estimation for nonorthogonal problems. Technometrics 12: 55–67. [Google Scholar] [CrossRef]
  14. Hou, Kewei, Chen Xue, and Lu Zhang. 2015. Digesting anomalies: An investment approach. The Review of Financial Studies 28: 650–750. [Google Scholar] [CrossRef]
  15. Hou, Kewei, Chen Xue, and Lu Zhang. 2020. Replicating Anomalies. The Review of Financial Studies 33: 2019–133. [Google Scholar] [CrossRef]
  16. Jia, Jinzhu, Karl Rohe, and Bin Yu. 2013. The Lasso under poisson-like heteroskedasticity. Statistica Sinica 23: 99–118. [Google Scholar]
  17. Kapetanios, George, and Filip Zikes. 2018. Time-varying Lasso. Economics Letters 169: 1–6. [Google Scholar] [CrossRef]
  18. Mahalanobis, Prasanta Chandra. 1936. On the generalized distance in statistics. Proceedings of the National Institute of Sciences of India 2: 49–55. [Google Scholar]
  19. Mullainathan, Sendhil, and Jann Spiess. 2017. Machine learning: An applied econometric approach. Journal of Economic Perspectives 31: 87–106. [Google Scholar] [CrossRef]
  20. Pástor, Lubos, Robert F. Stambaugh, and Lucian A. Taylor. 2022. Dissecting green returns. Journal of Financial Economics 146: 403–24. [Google Scholar]
  21. Safiullah, Md, Md Samsul Alam, and Md Shahidul Islam. 2022. Do all institutional investors care about corporate carbon emissions? Energy Economics 115: 106376. [Google Scholar] [CrossRef]
  22. Starks, Laura T. 2023. Presidential Address: Sustainable finance and ESG issues—Value versus Values. The Journal of Finance 78: 1837–72. [Google Scholar] [CrossRef]
  23. Stock, James H., and Mark W. Watson. 2019. Introduction to Econometrics. New York: Pearson. [Google Scholar]
  24. Tibshirani, Robert. 1996. Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society Series B (Statistical Methodology) 58: 267–88. [Google Scholar] [CrossRef]
  25. Varian, Hal R. 2014. Big data: New tricks for Econometrics. Journal of Economic Perspectives 28: 3–28. [Google Scholar] [CrossRef]
  26. Zou, Hui, and Trevor Hastie. 2005. Regularization and variable selection via the Elastic Net. Journal of the Royal Statistical Society Series B (Statistical Methodology) 67: 301–20. [Google Scholar] [CrossRef]
Table 1. Performance of nine ESG-rating- and size-sorted portfolios.

Average excess returns (% per month), by size group

| ESG rating | Small    | Medium  | Big     | Average | Long–Short |
|------------|----------|---------|---------|---------|------------|
| Worst      | 0.59 *** | 0.54 ** | 0.49 ** | 0.54 ** | 0.10       |
| Medium     | 0.52 **  | 0.41 ** | 0.35 *  | 0.43 ** | 0.17       |
| Best       | 0.46 **  | 0.37 *  | 0.40 *  | 0.41 ** | 0.06       |
| Average    | 0.52 **  | 0.44 ** | 0.41 ** |         |            |
| Long–Short | 0.13     | 0.17    | 0.09    |         |            |

Risk-adjusted returns (5-factor alphas), by size group

| ESG rating | Small   | Medium | Big    |
|------------|---------|--------|--------|
| Worst      | 0.009 * | −0.005 | −0.007 |
| Medium     | −0.005  | 0.006  | 0.001  |
| Best       | 0.008 * | 0.007  | −0.003 |

Note: *, **, and *** denote a rejection of the null hypothesis at the 10%, 5%, and 1% levels, respectively.
Table 2. Testing sample MSE from GLS and MDS cross-sectional regressions.

| ρ   | GLS    | MDS, λ = 0.5 | λ = 1  | λ = 3  | λ = 5  | λ = 10 | λ = 20 | λ = 50 |
|-----|--------|--------------|--------|--------|--------|--------|--------|--------|
| 0.1 | 0.0625 | 0.0476       | 0.0488 | 0.0490 | 0.0494 | 0.0493 | 0.0495 | 0.0518 |
| 0.2 | 0.0630 | 0.0483       | 0.0493 | 0.0492 | 0.0491 | 0.0487 | 0.0510 | 0.0529 |
| 0.5 | 0.0638 | 0.0512       | 0.0522 | 0.0437 | 0.0534 | 0.0535 | 0.0537 | 0.0544 |
| 0.8 | 0.0680 | 0.0550       | 0.0545 | 0.0566 | 0.0565 | 0.0571 | 0.0564 | 0.0573 |
| 0.9 | 0.0697 | 0.0564       | 0.0570 | 0.0574 | 0.0570 | 0.0572 | 0.0581 | 0.0586 |
Table 3. Performance of alternative methods under heteroskedastic cross-sectional regressions for the testing set.

| Summary statistic     | GLS    | MDS, λ = 0.5 | λ = 1  | λ = 3  | λ = 5  | λ = 10 | λ = 20 | λ = 50 |
|-----------------------|--------|--------------|--------|--------|--------|--------|--------|--------|
| Mean squared error    | 0.2027 | 0.1905       | 0.1854 | 0.1918 | 0.1939 | 0.1925 | 0.1930 | 0.1937 |
| Mean absolute error   | 0.3091 | 0.2803       | 0.2250 | 0.2163 | 0.2159 | 0.2147 | 0.2145 | 0.2155 |
| Median absolute error | 0.2539 | 0.2241       | 0.2245 | 0.2440 | 0.2334 | 0.2640 | 0.2449 | 0.2437 |
