
ESG scores, scandal probability, and event returns

Abstract

The informativeness of environmental, social, and governance (ESG) scores and their actual impact on firms remains understudied. To address this gap in the literature, we make theoretical predictions and conduct empirical research revealing that a high ESG score is associated with a lower probability of ESG scandals and lower stock returns during a scandal event. Our results suggest that ESG scores are heterogeneous but informative, and that a strong ESG reputation may have both positive and negative consequences for firms. Drawing on our findings, we develop a model and show that firms face an optimization problem when determining optimal ESG investment levels. Two equilibria may exist based on the trade-off between ESG scandal losses and ESG adjustment costs. Our model explains why certain firms make heterogeneous ESG decisions.

Introduction

The rise of environmental, social, and governance (ESG)-oriented investment in recent years has called for a deeper understanding of the effects of ESG ratings on stock markets. People tend to believe in human goodwill and to, in turn, perceive that socially responsible firms can be rewarded in the capital market. However, the current empirical evidence does not fully support this view. For example, Hartzmark and Sussman (2019) found that while investors predicted that high-sustainability firms would have better future performance, they did not find empirical evidence to support this view in the fund market. Meanwhile, Serafeim and Yoon (2023) recognized that ESG ratings could predict future ESG news, but they did not observe any connection between ESG ratings and market reactions to ESG news. Larcker and Watts (2020) similarly documented that investors perceive green and non-green securities issued by the same issuer as nearly identical substitutes, with no premium for being green. Hsu et al. (2023) found that highly-toxic emission firms generate higher stock returns than their counterparts, which can be attributed to the risk premium associated with the environmental policy risk they face.

While the literature generally suggests that ESG has a muted or negative effect on stock market performance, some evidence supports a positive correlation. Ferrell et al. (2016) found a positive relationship between corporate social responsibility and firm value. Avramov et al. (2021) theoretically inferred that a green-minus-brown portfolio can generate significantly positive payoffs over a reasonably long investment horizon. In two field surveys conducted in the United States of America, Bauer et al. (2021) found that participants were willing to expand their fund engagement based on ESG, although they expected this engagement to potentially harm financial performance; this inflow of funds into higher ESG-rated firms would naturally increase their stock prices. Ding et al. (2021) also observed that high ESG-rated firms experienced smaller negative stock returns during the stock market turbulence caused by the COVID-19 pandemic.

From a corporate finance perspective, there is evidence to support the link between firm valuation and ESG performance. Regarding the impact of ESG performance on cash flows (product market sales), Dai et al. (2021) found that corporate customers can influence a supplier’s corporate social responsibility performance. Ding et al. (2022) discovered that firms enhance their ESG performance to differentiate their products in response to intense product market competition, and that, currently, firms generally use high ESG performance to attract or retain both corporate and retail customers. Regarding the impact of ESG performance on discount rates, Cheng et al. (2014) documented that higher corporate social responsibility performance enables better access to financing, which in turn lowers the cost of capital and, accordingly, the discount rate. Some past studies therefore suggest that higher ESG performance enhances firm valuation. However, considering all the aforementioned studies, it is still unclear how ESG affects long-term stock performance.

Several factors may contribute to the inconclusiveness of this topic. One main argument is the low quality of ESG ratings. There is a growing number of ESG rating agencies worldwide, including Thomson Reuters ASSET4, S&P ESG (formerly RobecoSAM), S&P Trucost, MSCI ESG (formerly MSCI KLD and MSCI IVA), Sustainalytics, and Bloomberg ESG. However, rating methodologies and data sources not only vary across these raters but also change over time, even within a specific vendor (Gibson et al. 2021; Christensen et al. 2021; Avramov et al. 2022; Serafeim and Yoon 2023). These characteristics imply that, despite substantial investment in the accumulation of ESG data and the construction of rating methodologies, it is unlikely that every evaluation accurately mirrors a company’s true ESG profile. Therefore, both the industry and academia are seeking a better understanding of the validity of the various ESG scores. In this study, we combine a micro-founded model with empirical evidence to study the quality of ESG scores and how they can be utilized to generate better predictive performance.

We first constructed a model to illustrate how a rational investor combines different ESG scores and adjusts a firm’s valuation; this model assumes a constant true ESG level for a firm that can influence both ESG ratings and the likelihood of an ESG scandal. Initially, the investor receives inaccurate ESG scores and trades the stock accordingly. Subsequently, the firm may or may not face an ESG scandal and the investor may adjust its valuation based on the scandal’s occurrence. In the final stage, firm cash flow is realized, with a negative adjustment if an ESG scandal occurs in the previous stage. This reflects the market penalty imposed on firms with low ESG performance, as Ding et al. (2022) suggested.

This model generated two testable predictions. First, it suggests that if an ESG scandal occurs, a firm will experience a negative stock market response. Second, although it is commonly believed that higher ESG-rated firms are more resilient during crises, our model implies that higher ESG-rated firms experience more negative stock returns in the event of an ESG scandal. These two perspectives do not contradict each other, as a high ESG-rated firm may be robust against overall market crises and less so when facing its own scandal. This is because the occurrence of an ESG scandal implies not only future penalties in the product market but also a higher likelihood that the company's ESG score was overestimated. Both factors contribute to reduced valuation.

To validate our model’s key assumptions, we conducted empirical analyses using large-scale ESG ratings and news data. Our findings demonstrate that a higher ESG score generally indicates a lower probability of an ESG scandal. From the stock market perspective, we observed a negative cumulative abnormal return surrounding ESG scandals, and higher ESG-rated firms exhibited lower returns during ESG scandals. We also found that a greater disagreement in ESG scores is associated with lower scandal-related returns.

Based on our empirical evidence, which indicated that higher ESG ratings reduce the likelihood of a scandal but increase the associated losses, we developed a model to illustrate that firms face a trade-off between ESG investment cost and the impact of a scandal. Our findings show that this trade-off determines the optimal level of ESG investment for a firm. There are two types of equilibrium in this context. A firm with a low ESG evaluation may incur prohibitive costs when attempting to substantially enhance its ESG rating; if the advantages of improved ESG metrics (e.g., smaller repercussions from scandals) fail to exceed the costs of improving its ESG perception, the optimal strategy for the firm would likely be to maintain a low ESG rating. Our model further suggests that social preferences regarding ESG (which determine the payoff in the product market), firm size, and initial ESG rating levels are key factors in determining a firm’s optimal ESG investment decisions.

Our study contributes to several literature fields, mostly studies on the signaling effect of ESG scores. Yoon et al. (2006) found that the sincerity of motives determines the effectiveness of corporate social responsibility activities. Dunbar et al. (2021) observed that the motivation for a firm’s ESG engagement is more questionable when the firm is of a high reputation and involved in misconduct. Our study contributes to this field by deriving separate equilibria for firms’ optimal ESG investments. These equilibria have implications for practitioners when considering their ESG investment decisions based on the social preferences for ESG, firm size, and initial ESG level. To the best of our knowledge, this study is the first to deliver these pieces of evidence.

Our study also contributes to the literature on ESG rating properties. As mentioned above, there is an ongoing debate on whether ESG scores are informative (Bartov et al. 2020; Gibson et al. 2021; Christensen et al. 2021; Berg et al. 2022; Avramov et al. 2022). This study shows that, despite their heterogeneity, ESG scores retain predictive power, at least to some extent, for ESG scandal probability and scandal-related returns.

Finally, we contribute to discussions on the real impact of ESG. Several studies show that corporate customers’ and investors’ ESG preferences shape firm ESG practices and financial decisions (Ferrell et al. 2016; Capelle-Blancard and Petit 2019; Chen et al. 2020; Dai et al. 2021). We add to this field by developing a model that characterizes the optimization problem faced by firms with ESG concerns. In particular, we introduce three ESG investment-related dimensions that can be empirically tested in future studies: firm size, market ESG preference, and initial reputation level. Furthermore, investors’ ESG preferences can be split into value alignment and impact seeking (Bonnefon et al. 2022) and individually tested.

The remainder of this paper is organized as follows. “Data” Section describes the data used in our empirical analyses. “Analysis” Section presents the development of the model used to generate testable predictions, illustrates the adopted empirical methodology, and discusses the empirical results. “Firm’s optimal ESG investment” Section introduces the firm’s optimal ESG investment problem based on our empirical findings. “Conclusion” Section concludes the manuscript.

Data

ESG scores were originally developed in the 1980s to supplement traditional financial data and provide investors with additional information about companies. These scores quantify company performance in terms of ESG issues. This study collected ESG data from four notable ESG rating providers, as follows: KLD (now MSCI), ASSET4 (now Refinitiv ESG), Sustainalytics (now Morningstar), and S&P Global. These vendors cover a significant proportion of the companies that have ESG performance ratings. Table 1 illustrates the similarities between the ESG scores of the four rating entities under scrutiny. The average ESG score of a company was computed as the mean of the four scores from the rating providers.

Table 1 Input similarity matrix

This study also used another score, the principal component analysis (PCA) ESG score, which represents the collective perspective of rating agencies toward a company at a given time. We conducted PCA on the ratings obtained from the four vendors to derive the first principal component, subsequently defining it as the PCA ESG score. The PCA ESG score captures 63% of the variance in the overall ESG scores rated by the four ESG rating agencies, as presented in Table 2.

Table 2 Principal components

Disagreement among ESG providers regarding a company’s performance can also influence stock reactions following ESG scandals. Therefore, we calculated ESG disagreement as the standard deviation of the ratings provided by the four agencies (see Appendix B for a more extensive discussion on the development of ESG scores, their distinctions, and recent studies on the subject).
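The three composite measures described above (average score, PCA ESG score, and disagreement) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' code: the function name `composite_esg_scores`, the simulated ratings, the use of standardized scores for the disagreement measure, and the sample standard deviation (`ddof=1`) are all assumptions.

```python
import numpy as np

def composite_esg_scores(ratings):
    """ratings: (n_obs, n_raters) array of raw ESG scores, one column per rater."""
    ratings = np.asarray(ratings, dtype=float)
    # Standardize each rater's scores to mean 0 and standard deviation 1.
    z = (ratings - ratings.mean(axis=0)) / ratings.std(axis=0)
    # Average ESG score: cross-rater mean of the standardized scores.
    average = z.mean(axis=1)
    # Disagreement: cross-rater standard deviation (assumption: ddof=1).
    disagreement = z.std(axis=1, ddof=1)
    # PCA through the eigen-decomposition of the covariance matrix.
    cov = np.cov(z, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    first = eigvecs[:, -1]
    # The sign of an eigenvector is arbitrary; orient it so that higher
    # ratings translate into a higher PCA score.
    if first.sum() < 0:
        first = -first
    pca_score = z @ first
    explained = eigvals[-1] / eigvals.sum()
    return average, pca_score, disagreement, explained

# Synthetic example: four raters observe a common latent level with noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
ratings = latent[:, None] + rng.normal(scale=0.8, size=(500, 4))
average, pca_score, disagreement, explained = composite_esg_scores(ratings)
print(f"share of variance on the first component: {explained:.2f}")
```

When the four raters load on a common factor with similar weights, the average and the first principal component are nearly collinear, which is why the two composites tend to behave alike in regressions.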

To identify ESG scandals, we used event data compiled by RepRisk. RepRisk continuously monitors over 100,000 public sources and stakeholders for ESG incidents. The methodology employed was event-driven rather than company-driven. The RepRisk news dataset catalogues ESG-related incidents between 2007 and 2020. We classified incidents recorded in this dataset as ESG scandals by creating a binary variable called ESG Scandal, which is one if a firm has at least one RepRisk ESG scandal in the next 12 months, and zero otherwise.
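The forward-looking dummy can be sketched as below. The helper `forward_scandal_dummy` and the integer month index are hypothetical, and the window is assumed to cover months t+1 through t+12 (excluding the current month), which the text does not state explicitly.

```python
def forward_scandal_dummy(incident_months, sample_months, horizon=12):
    """Return 1 for month t if any incident falls in months t+1 .. t+horizon."""
    incidents = set(incident_months)
    return [
        int(any((t + h) in incidents for h in range(1, horizon + 1)))
        for t in sample_months
    ]

# Months are integer indices (for example, 0 = January 2007); values are illustrative.
dummy = forward_scandal_dummy(incident_months=[15, 40], sample_months=range(0, 45))
```

With incidents in months 15 and 40, the dummy switches on for months 3 to 14 and 28 to 39, and is zero elsewhere.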

For the event returns analysis, we first compiled a list of ESG scandals and their corresponding event dates from RepRisk. We then obtained stock returns data for the involved firms around the scandal event date from CRSP. Following the approach of Brown and Warner (1985), we employed the capital asset pricing model to estimate predicted returns. The model parameters were estimated over the window from 250 to 20 trading days prior to the scandal events. Expected and cumulative abnormal returns were subsequently calculated for the period between 10 trading days before and after the scandal events.
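The event-study steps can be sketched as follows. This is a simplified market-model version under stated assumptions: the function `cumulative_abnormal_return` is hypothetical, returns are used without subtracting a risk-free rate, and the synthetic series exists only to check that a known shock is recovered.

```python
import numpy as np

def cumulative_abnormal_return(stock, market, event_idx,
                               est_start=250, est_end=20, window=10):
    """CAR over [-window, +window] using parameters fitted on [-est_start, -est_end]."""
    stock = np.asarray(stock, dtype=float)
    market = np.asarray(market, dtype=float)
    # Estimation window: 250 to 20 trading days before the event.
    est = slice(event_idx - est_start, event_idx - est_end + 1)
    # np.polyfit with deg=1 returns [slope, intercept].
    beta, alpha = np.polyfit(market[est], stock[est], deg=1)
    # Event window: 10 trading days before to 10 trading days after.
    evt = slice(event_idx - window, event_idx + window + 1)
    expected = alpha + beta * market[evt]
    abnormal = stock[evt] - expected
    return abnormal.sum()

# Synthetic check: a noiseless stock with beta 1.2 and a -3% shock on the event day.
rng = np.random.default_rng(42)
market = rng.normal(0.0, 0.01, size=300)
stock = 0.0002 + 1.2 * market
event_idx = 270
stock[event_idx] -= 0.03
car = cumulative_abnormal_return(stock, market, event_idx)
```

Because the estimation window ends 20 days before the event, the shock does not contaminate the fitted parameters, and the CAR in this noiseless example equals the injected shock.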

We also retrieved financial performance data from the Compustat Fundamentals database to address the control variables in our study. Annual financial information was systematically gathered, encompassing a range of factors such as total assets, cash and short-term investments, current assets, current liabilities, intangible assets, earnings before interest, long-term debt due in one year, total debt including current, sales, operating activities net cash flow, capital expenditures, total dividends, and net income. Drawing on the established literature on market returns correlations (Haque and Sarwar 2013; Sarwar et al. 2013; Brown and Huang 2020), 11 key financial characteristics were calculated as control variables. These variables were size, cash ratio, current ratio, intangibility, return on assets, maturing debt, leverage, growth, cash flow volatility, capital expenditure, and dividend payout ratio. The calculation for each control variable is detailed in Appendix A, along with a succinct textual explanation of the variable for clarity. Our two samples are described in the “Scandal probability” and “Event returns” Sections, respectively.

Analysis

Theoretical predictions

In this section, we build a simple model to predict the relationship between ESG scores and event returns when a scandal occurs. The model considers one firm over three periods. The firm has a real ESG level \(v \in \left[ {0,1} \right]\), which is unknown to investors. At \(t = 0\), ESG rating agencies disclose an ESG score \(s\sim N\left( {v,\tau } \right)\) for the firm; that is, \(s\) follows a normal distribution centered on \(v\), where \(\tau\) is the rating difficulty of the firm. At \(t = 1\), the firm is exposed to an ESG scandal with probability \(p\left( v \right)\), where \(p\left( \cdot \right) \in \left[ {0,1} \right]\) and \(p^{\prime} < 0\). This leads to the following hypothesis:

Hypothesis 1

Firms with high ESG rating have a lower probability of experiencing an ESG scandal.

At \(t = 2\), the firm realizes its product sales \(D\left( {S,\lambda } \right) = 1 - S\lambda\), where \(\lambda \in \left[ {0,1} \right]\) is the aggregated customer’s ESG preference and \(S \in \left\{ {0,1} \right\}\) is a dummy indicating whether the scandal happened or not. Importantly, the formulation assumes a constant level of price elasticity throughout the analysis, implying that the relationship between product sales and aggregated customer’s ESG preference, as captured by the function \(D\left( {S,\lambda } \right),\) is based on a consistent price elasticity factor. Although this assumption simplifies the analysis, we acknowledge that price elasticity variations can affect revenue sustainability; accordingly, sensitivity analyses that explore different elasticity scenarios may provide additional insights into our model’s robustness.

We then derive the model to generate predictions. At \(t = 0\), the value (product sale) of the firm can be calculated by taking the mathematical expectation over ESG scandal occurrence, as follows:

$$\begin{aligned} E\left[ {D{|}s} \right] = & p\left( s \right)D\left( {1,\lambda } \right) + \left[ {1 - p\left( s \right)} \right]D\left( {0,\lambda } \right) \\ = & 1 - p\left( s \right)\lambda . \\ \end{aligned}$$
(1)

Here \(D\left( {1,\lambda } \right) = 1 - \lambda\) is the product sales when the scandal happens, and \(D\left( {0,\lambda } \right) = 1\) is the product sales when the scandal does not happen. After \(t = 1\), it is revealed whether the ESG scandal happens. The value of the firm should be \(1 - S\lambda\), making the return for period \(t = 1\) be

$$\begin{array}{*{20}{c}} {R = \frac{1 - S\lambda }{{1 - p\left( s \right)\lambda }} - 1} \end{array}$$
(2)

If there is no scandal revealed in \(t = 1\), the return \({R_{{\text{S}} = 0}}\) is

$$\begin{array}{*{20}{c}} {{R_{{\text{S}} = 0}} = \frac{1}{1 - p\left( s \right)\lambda } - 1 > 0.} \end{array}$$
(3)

If a scandal happened, the return \({R_{{\text{S}} = 1}}\) is

$$\begin{array}{*{20}{c}} {{R_{{\text{S}} = 1}} = \frac{1 - \lambda }{{1 - p\left( s \right)\lambda }} - 1 < 0.} \end{array}$$
(4)

This generates a testable prediction, as follows,

Hypothesis 2

The investor penalizes the firm that experiences an ESG scandal on the stock market.

Since \(p^{\prime}\left( {\text{s}} \right) < 0\), the partial derivative with regard to \(s\) is

$$\begin{array}{*{20}{c}} {\frac{{\partial {R_{{\text{S}} = 1}}}}{\partial s} = \frac{{\left( {1 - \lambda } \right)\lambda p^{\prime}\left( {\text{s}} \right)}}{{{{\left( {1 - p\left( s \right)\lambda } \right)}^2}}} < 0.} \end{array}$$
(5)

This generates another testable prediction, as described herein,

Hypothesis 3

If the ESG score is higher, the return on scandal event is lower.
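A small numerical check of Eqs. (3) to (5) may help fix ideas. The logistic form of `scandal_probability` and the value of `lam` are assumptions chosen only for illustration; the model itself requires nothing more than \(p^{\prime} < 0\).

```python
import math

def scandal_probability(s):
    # Hypothetical decreasing function with values in (0, 1);
    # the model only requires p' < 0.
    return 1.0 / (1.0 + math.exp(4.0 * (s - 0.5)))

def event_return(s, lam, scandal):
    """Return R from Eq. (2) for score s, preference lam, and S in {0, 1}."""
    return (1.0 - scandal * lam) / (1.0 - scandal_probability(s) * lam) - 1.0

lam = 0.3  # illustrative customer ESG preference
for s in (0.2, 0.5, 0.8):
    r_no_scandal = event_return(s, lam, 0)
    r_scandal = event_return(s, lam, 1)
    print(f"s={s:.1f}  R(S=0)={r_no_scandal:+.4f}  R(S=1)={r_scandal:+.4f}")
```

In this sketch the no-scandal return is positive, the scandal return is negative, and the scandal return becomes more negative as the score rises, because a higher score implies that less of the potential loss was priced in at \(t = 0\).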

Scandal probability

Following the methodology outlined by Anderson et al. (2018) and Agrawal et al. (2022), we employed a panel logistic Poisson regression with fixed effects to investigate the influence of ESG scores on the probability of scandals within companies during a corresponding month. Our primary objective was to discern whether ESG scores impact the likelihood of a company experiencing a scandal, and if so, to what extent. To achieve this, we estimate scandal probability using the following model:

$$\begin{array}{*{20}{c}} {Scanda{l_{c,t}} = {\beta_1}Scor{e_{c,t}} + {\beta_2}Disagre{e_{c,t}} + \Gamma {F_{c,t}} + {\delta_c} + {\gamma_t} + {\varepsilon_{c,t}},} \end{array}$$
(6)

where \(Scanda{l_{c,t}}\) is a dummy indicating whether firm \(c\) will experience an ESG scandal in the 12 months after time \(t\). \(Scor{e_{c,t}}\) is the ESG score of firm \(c\) at time \(t\). We calculate \(Disagre{e_{c,t}}\) as the standard deviation of the ESG scores from the four rating agencies of S&P, ASSET4, KLD, and Sustainalytics. \({F_{c,t}}\) is a vector of company fundamentals comprising 11 factors, namely size, cash ratio, current ratio, intangibility, return on assets, maturing debt, leverage, growth, cash flow volatility, capital expenditure, and dividend payout ratio. \({\delta_c}\) is the company fixed effect, which conditions out time-invariant, company-specific effects on scandal probability. Similarly, the time fixed effect \({\gamma_t}\) conditions out period-specific effects that are common to all companies.

In our experiments, we conducted six fixed-effect linear regressions using different sets of independent variables. To provide more details, we employed six different ESG scores (i.e., S&P, ASSET4, KLD, Sustainalytics, average scores of the four, and the first PCA ESG score derived from the four) in combination with the same set of company characteristics to evaluate the probability of a company scandal.
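The role of the two fixed effects in Eq. (6) can be illustrated with a within (two-way demeaning) estimator. This is a linear sketch on a balanced synthetic panel with a continuous outcome, used only to show that known slopes are recovered once firm and time effects are removed; it is not the exact estimator used in the study, and the helper names and all simulated values are assumptions.

```python
import numpy as np

def two_way_within(x):
    """Demean a (firms, periods) array by firm and by period (balanced panel)."""
    return (x - x.mean(axis=1, keepdims=True)
              - x.mean(axis=0, keepdims=True) + x.mean())

def fe_slopes(y, regressors):
    """OLS slopes after removing firm and time fixed effects."""
    y_w = two_way_within(y).ravel()
    x_w = np.column_stack([two_way_within(r).ravel() for r in regressors])
    coef, *_ = np.linalg.lstsq(x_w, y_w, rcond=None)
    return coef

rng = np.random.default_rng(1)
n_firms, n_periods = 200, 24
firm_effect = rng.normal(size=(n_firms, 1))    # delta_c
time_effect = rng.normal(size=(1, n_periods))  # gamma_t
score = rng.normal(size=(n_firms, n_periods)) + firm_effect  # correlated with delta_c
disagree = rng.normal(size=(n_firms, n_periods))
noise = rng.normal(scale=0.1, size=(n_firms, n_periods))
# Synthetic outcome with known slopes: -0.05 on score, 0.02 on disagreement.
y = -0.05 * score + 0.02 * disagree + firm_effect + time_effect + noise
beta = fe_slopes(y, [score, disagree])
```

Because the simulated score is correlated with the firm effect, a pooled regression without \({\delta_c}\) would be biased, whereas the within estimator recovers slopes close to the true values.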

We present the summary statistics for the sample in Panel A of Table 3, which consists of 127,478 firm-month observations. An average ESG scandal score of 0.26 suggests a 26% likelihood of a firm experiencing an ESG scandal within any given 12-month interval. This figure may appear elevated, reflecting RepRisk’s expansive criteria for identifying ESG scandals; that is, some scandal events reported by RepRisk may not be significant or well-known. To ensure consistency, we standardize all ESG scores to have a mean of zero and a standard deviation of one. Standardized ESG scores were then used to calculate the average and the PCA ESG scores.

Table 3 Summary descriptives

The empirical effect of ESG scores on scandal probability was tested, and the regression results are presented in Table 4. As shown in Table 4, the S&P, KLD, average, and PCA ESG scores had significant and negative effects on scandal occurrence. The Sustainalytics score also exhibited negative associations with scandal probability, albeit not to a statistically significant extent. In terms of economic magnitude, a one standard deviation increase in ESG scores corresponds to a 0.1–1.1% decrease in scandal probability. These findings generally support Hypothesis 1. We provide additional empirical results based on the alternative specifications in Panel B of Table 4.

Table 4 ESG score and scandal probability

The coefficient estimates of the control variables supported the validity of the measures. Larger firms tended to have a higher probability of ESG scandals, possibly due to the greater scrutiny they receive. Firms with higher intangibility and profitability had a lower probability of ESG scandals, indicating that they can prioritize good governance and social responsibility because of less earnings pressure. Conversely, high leverage was associated with a higher probability of ESG scandals, highlighting the correlation between financial and ESG risks. A robust sales growth not only provides companies with financial stability and resources to invest in ethical practices (Bint-Tariq and Nobanee 2020), but also incentivizes a commitment to ESG principles, fostering stakeholder trust, regulatory compliance, and a long-term perspective, thereby reducing ESG scandal likelihood.

Event returns

To study the relationship between ESG scores and event returns, we first used the following ordinary least squares regression:

$$\begin{array}{*{20}{c}} {CA{R_{c,i,t}} = {\beta_1}Scor{e_{c,t}} + {\beta_2}Disagre{e_{c,t}} + \Gamma {F_{c,t}} + {\varepsilon_{c,i,t}}} \end{array}$$
(7)

where \(c\), \(i\), and \(t\) index the company, scandal event, and time, respectively. \(Score\;\) is the particular ESG score being studied, and \(Disagree\) is the disagreement in ESG scores among the four data agencies. \(F\) is the vector of the firm’s fundamental characteristics. We estimate an ordinary least squares regression with robust standard errors clustered at the company level. As in the analysis of scandal probability, we conducted six regression analyses (i.e., ASSET4, S&P, Sustainalytics, KLD, the average score of the four, and the PCA ESG score).

Panel B of Table 3 presents the summary statistics of the sample. The total number of scandals was 40,045, and the average ESG scandal return was −0.29, indicating that the cumulative abnormal return for an ESG scandal event was −0.29%. This descriptive statistic supports Hypothesis 2. However, it is important to note that this statistic is not economically significant as RepRisk identifies ESG scandals in a broad manner, including events that may not be significant or well-known.

As Table 5 shows, firms with higher ESG scores tended to experience more severe ESG scandals, as all coefficients related to ESG scores were statistically significant and negative. In terms of economic significance, a one-standard-deviation increase in ESG score was responsible for an average decrease of 7–20 basis points in event returns. This finding supports our Hypothesis 3. We also provide the empirical results for alternative specifications in Panel B of Table 5.

Table 5 ESG score and scandal returns

Furthermore, we observed that disagreements among ESG rating providers significantly affected event returns, wherein the greater the disagreement, the more negative the impact of the scandal on the company’s stock returns. Specifically, a one-standard-deviation increase in ESG disagreement was associated with a decrease of 22–26 basis points.

Additionally, we compared the predictability of various ESG scores for scandal probability and event returns (Table 6), so as to demonstrate our main results’ robustness. Importantly, this study did not specifically focus on comparing predictability across different ESG scores.

Table 6 Predictability of ESG scores

Firm’s optimal ESG investment

In this section, we construct a model to show that, based on our empirical findings (see “Analysis” Section), each firm faces an optimization problem in determining ESG investment. We define firm cash flow as

$$\begin{array}{*{20}{c}} {{X_t} = \left( {{{\Pi }_t} - {{\Phi }_t}} \right) - \left( {{{\Gamma }_t} + {{\Theta }_{\text{t}}}} \right)} \end{array}$$
(8)

where \({X_t}\) is the residual cash for the investor of a firm; \({{\Pi }_t}\) is firm gross profit; \({{\Phi }_t}\) is the adjustment costs of capital (Bonnefon et al. 2022); \({{\Gamma }_t}\) is the adjustment costs of reputation capital (the cost of changing ESG level); \({{\Theta }_{\text{t}}}\) is the expected loss from an ESG scandal. Furthermore, and similar to the approach used by Liu et al. (2009), the adjustment costs of capital and of reputation capital are defined as

$$\begin{array}{*{20}{c}} {{{\Phi }_t} = \phi \left( {\frac{I_t^K}{{K_t}}} \right){K_{t - 1}},} \end{array}$$
(9)
$$\begin{array}{*{20}{c}} {{{\Gamma }_t} = \gamma \left( {\frac{I_t^G}{{G_t}}} \right){G_{t - 1}},} \end{array}$$
(10)

where \(\phi \left( \cdot \right)\) increases with the capital investment level \(\frac{I_t^K}{{K_t}}\), and \(\gamma \left( \cdot \right)\) increases with the reputation investment level \(\frac{I_t^G}{{G_t}}.\) These are the unit costs of capital adjustment. Therefore, the total cost should be scaled by the original size of capital, \({K_{t - 1}}\), and reputation, \({G_{t - 1}}\).

Relating to our empirical findings, we calculate the expected loss from an ESG scandal (\({{\Theta }_{\text{t}}}\)) in a manner similar to the calculation of expected loan loss in a commercial bank:

$$\begin{array}{*{20}{c}} {{{\Theta }_{\text{t}}} = P\left( {\frac{{G_t}}{{K_t}}} \right)\theta \left( {\frac{{G_t}}{{K_t}};\lambda } \right){K_t}} \end{array}$$
(11)

where \(P\left( \cdot \right)\) is scandal probability and \(P \in \left[ {0,1} \right]\); \(\theta \left( \cdot \right)\) is the loss given a scandal, while \(\theta ^{\prime} > 0\); \(\frac{{G_t}}{{K_t}}\) is the ESG level relative to firm size; \(\lambda\) is the aggregated ESG preference of a market.

Therefore, a firm faces two sub-problems in order to maximize its cash flow (\({X_t}\)): (1) maximize operating profit \({{\Pi }_t} - {{\Phi }_t}\) and (2) minimize ESG cost \({{\Gamma }_t} + {{\Theta }_{\text{t}}}\). To make this problem easier to understand, we arbitrarily assume the following functional forms:

$$\begin{array}{*{20}{c}} {\left\{ {\begin{array}{*{20}{l}} {P = {{\left( {1 - \frac{G}{K}} \right)}^2}} \\ {\theta = \lambda \ln \left( {1 + \frac{G}{K}} \right)} \\ {\Gamma = \gamma G} \\ {\gamma \left( {\frac{{I^G}}{G}} \right) = \left( {\frac{{I^G}}{G}} \right) + {{\left( {\frac{{I^G}}{G}} \right)}^2}} \end{array}} \right..} \end{array}$$
(12)

Figure 1 shows how expected losses from scandals change by ESG levels. When a firm is considered 100% unethical, ESG scandals imply no loss because the market already acknowledges that the firm regularly engages in unethical practices. Similarly, if a firm is perceived as 100% ethical, it has nothing to lose from an ESG scandal because the probability of its involvement in such an event is zero. Still, as the market becomes more concerned about ESG, the expected loss increases, as illustrated in Fig. 2.
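The shape of the expected loss under the functional forms in Eq. (12) can be reproduced with a few lines. The function name `expected_loss_per_unit`, the grid resolution, and the value of `lam` are illustrative assumptions; the loss is expressed per unit of firm size \(K\).

```python
import math

def expected_loss_per_unit(g_over_k, lam):
    """P * theta from Eq. (12): (1 - G/K)^2 * lam * ln(1 + G/K)."""
    p = (1.0 - g_over_k) ** 2               # scandal probability
    theta = lam * math.log(1.0 + g_over_k)  # loss given a scandal
    return p * theta

grid = [i / 100 for i in range(101)]        # G/K from 0 to 1
losses = [expected_loss_per_unit(g, lam=1.0) for g in grid]
peak = grid[losses.index(max(losses))]
print(f"expected loss peaks near G/K = {peak:.2f}")
```

Under these forms the expected loss is zero at both ends of the grid and peaks at an interior ESG level of roughly 0.3, and it scales linearly with \(\lambda\), so a stronger ESG preference raises the whole curve.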

Fig. 1 Expected Loss from Scandal. This figure shows how expected loss from scandal is determined. \(P\) is the probability of scandal events. \(\theta\) is the loss given that an ESG scandal happens. \(P\theta\) is the expected loss from scandal. Both \(P\) and \(\theta\) are functions of \(\frac{G}{K}\), which is the standardized reputation level. \(G\) is reputation level. \(K\) is firm size.

Fig. 2 Expected Loss from Scandal under Different ESG Preference. This figure shows how expected loss from scandal is affected by different levels of ESG preference \(\lambda\). \(\frac{G}{K}\) is the standardized reputation level. \(G\) is reputation level. \(K\) is firm size.

However, firms cannot simply strive to be 100% ethical or unethical to minimize expected losses, because adjusting reputation carries a cost that increases with the magnitude of the adjustment. Consequently, firms must determine an optimal ESG investment level to minimize overall ESG-related costs, as Fig. 3 demonstrates. As shown in Panel A of Fig. 4, in a market where ESG preference is absent, the optimal choice for firms is not to invest in ESG, whereas an increase in ESG preference leads to an increase in optimal ESG investment. Meanwhile, Panel B of Fig. 4 indicates that for firms with poor ESG performance, increasing ESG investment may not yield sufficient benefits. Consequently, these firms may choose to remain at their original levels. For firms with high ESG performance, the required investment is relatively low. Panel C of Fig. 4 highlights the significance of matching firm size with the optimal ESG investment level.
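The trade-off between \({\Gamma }\) and \({\Theta }\) can be explored with a simple grid search. The sketch below rests on several assumptions that are not taken from the paper: investment adds one-for-one to reputation (\(G_t = G_{t-1} + I^G\), no depreciation), the investment rate is measured relative to pre-investment reputation \(G_{t-1}\), firm size \(K\) is held fixed, and the values of \(\lambda\) are illustrative rather than calibrated. The function names are hypothetical.

```python
import numpy as np

def total_esg_cost(invest, g0, k, lam):
    """Gamma + Theta under Eq. (12) and the assumptions stated above."""
    x = invest / g0                       # investment rate (assumption: relative to G_{t-1})
    gamma_cost = (x + x ** 2) * g0        # adjustment cost of reputation capital
    g_ratio = (g0 + invest) / k           # post-investment ESG level G/K
    theta_cost = (1 - g_ratio) ** 2 * lam * np.log1p(g_ratio) * k  # expected scandal loss
    return gamma_cost + theta_cost

def optimal_investment(g0, k, lam, n_grid=2001):
    """Grid search for the cost-minimizing investment, keeping G/K within [0, 1]."""
    grid = np.linspace(0.0, k - g0, n_grid)
    costs = total_esg_cost(grid, g0, k, lam)
    return float(grid[np.argmin(costs)])

for g0, lam in [(0.5, 0.0), (0.5, 10.0), (0.1, 10.0), (0.9, 10.0)]:
    best = optimal_investment(g0, k=1.0, lam=lam)
    print(f"G0={g0:.1f}, lambda={lam:4.1f} -> optimal investment {best:.3f}")
```

Under these assumptions the search returns zero investment when \(\lambda = 0\) and for the low-reputation firm, a positive interior level for the mid-reputation firm, and a small positive level for the firm that is already close to the top, which is qualitatively in line with the patterns described for Fig. 4.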

Fig. 3 Optimal ESG Investment. This figure shows how a firm's optimal ESG investment is determined. \({\Theta }\) is the expected loss from ESG scandal. \({\Gamma }\) is the costs of ESG investment. \({\Gamma } + {\Theta }\) is the total costs of ESG. \({I^G}\) is the ESG investment level.

Fig. 4 Optimal ESG investment under different conditions. These figures show how the optimal ESG investment level changes with different parameters. \(K\) is firm size. \(G\) is reputation level. \({I^G}\) is the ESG investment level. \(\lambda\) is ESG preference.

Although this study does not provide an analytical solution to the model, it presents several additional topics for future exploration. For example, by calibrating the ESG preference parameter (\(\lambda\)), one can determine how the market views ESG. In the case of using a cross-country sample, the simplest method to estimate ESG preference is by calculating the country-fixed effects for ESG scandal event returns (the cumulative abnormal return, labeled as CAR in Table 5). Another less direct approach is to link the data on ESG scandal event returns to the World Values Survey data; this Survey dataset comprises hundreds of questions on people’s values and beliefs, and researchers can apply machine learning methods (e.g., least absolute shrinkage and selection operator, also known as LASSO) to identify the dimensions most relevant to scandal event returns and compile them into an ESG preference index for each country. There is also a more complex approach, namely, applying structural estimation by matching the reputation assets/investment (\(G\)), tangible assets (\(K\)), scandal probability (\(P\)), and firm market value times scandal return (\({\Theta }\)) data. By applying Eq. (11), we can then back out the function \(\theta \left( {\frac{{G_t}}{{K_t}};\lambda } \right)\). With an additional functional form assumption, researchers can estimate ESG preference \(\lambda\) for a country. Because this model was designed as a dynamic model, it may also be worth replacing \(\lambda\) with \({\lambda_t}\) to see how ESG preference changes over time.

However, estimating a country’s preference for ESG requires a consistent measure of its firms' reputational assets and investments, which in turn requires the development of accounting standards that enable the accurate measurement of firm ESG investments and capital. Such standards would also allow investors to value firms fairly.

Conclusion

In this study, we first demonstrate, both theoretically and empirically, that a higher ESG score is associated with (1) lower ESG scandal probability and (2) higher losses given a scandal. These findings convey two key messages: (1) ESG scores are informative, as they can predict negative ESG events; and (2) high ESG-rated firms should maintain elevated standards to avoid scandals, as the losses they would otherwise incur are larger than those of low ESG-rated firms.

Further, we outline a model that provides insights into how firms optimize their ESG investments. Our model reveals that firms have distinct equilibria for ESG investment, primarily because of the trade-off between the costs of adjusting ESG practices and the losses incurred from ESG scandals. For low ESG-rated firms, the costs of improving and maintaining high ESG standards are higher than the losses from ESG scandals, so their optimal strategy is to maintain a low ESG level. Our findings also emphasize the significance of ESG preferences in the market and suggest potential research directions for studies on ESG preferences and optimal ESG investments.

Our evidence demonstrates that if the costs of adjusting ESG practices are too high, or if the market does not consider ESG factors important, firms may choose not to invest in ESG as the optimal solution. This makes it vital for governments aiming to promote a socially responsible commercial society to reduce ESG investment costs. Policies could include tax incentives for ESG investment, such as green investment (E), donations (S), and recruitment of sustainability auditors (G), and the promotion of high ESG firms. Moreover, citizen education on ESG issues holds potential to foster more socially-responsible behavior among firms.

In conclusion, this study highlights the critical connections between ESG scores, scandal probabilities, and scandal returns. Although our findings are descriptive, they underscore the potential for causal exploration using rigorous identification strategies. Future research could extend our model to incorporate additional dimensions, including the impact of capital structure on reputation investment decisions and the nuanced influence of ESG preferences. This study also delivers a roadmap for future investigations (see the “Firm’s optimal ESG investment” section) and contributes to the ongoing discourse on the complex interplay between ESG considerations and financial outcomes.

Availability of data and materials

All the data we use are proprietary, so we cannot disclose them to the public. The code will be provided upon request.

Notes

  1. Corporate social responsibility and ESG are highly-related concepts with some differences; for more on the topic, see https://blog.worldfavor.com/esg-vs-csr-what-is-the-difference.

  2. For technical brevity, we assume that the market has no prior belief on $$v$$, so the probability of ESG scandal is simply $$p\left(s\right)$$, where $$s$$ is the ESG score aggregated by various ESG ratings $${s}_{i}$$ on the market ($$i$$ indexes different rating agencies). The conclusions are qualitatively unchanged if there is a prior of $$v$$.

  3. To partially address multicollinearity problems, we conducted correlation analysis between our key explanatory variables and control variables. The results are presented in Appendix C, and the correlation coefficients were small in absolute terms.

  4. The ratio between these two sample sizes was 31% (40,045/127,478) and larger than 26% (average ESG scandal for the first sample) because some firms may have multiple scandals within 12 months. Such cases contribute multiple observations to the scandal return sample, but are underrepresented in the scandal probability sample because the dependent variable for this sample is a dummy.

  5. “Capital” here refers to broad-range production factors.

  6. See: https://bit.ly/4b9yXLg.

Abbreviations

ESG: Environmental, social, and governance

E: Environmental

S: Social

G: Governance

CSR: Corporate social responsibility

PCA: Principal component analysis

AT: Total assets

CHE: Cash and short-term investments

ACT: Current assets

LCT: Current liabilities

INTAN: Intangible assets

EBITDA: Earnings before interest, taxes, depreciation, and amortization

DD1: Long-term debt due in one year

DT: Total debt including current

SALES: Sales

OANCF: Operating activities net cash flow

CAPX: Capital expenditures

DVT: Total dividends

NI: Net income

ROA: Return on assets

CAR: Cumulative abnormal return

WVS: World Values Survey

LASSO: Least absolute shrinkage and selection operator

References


Acknowledgements

Not applicable.

Funding

The study was funded by Hong Kong Polytechnic University under grant number P0047740.

Author information

Authors and Affiliations

Authors

Contributions

WS: Data preprocessing, data analysis, writing. YL: Data analysis, writing. SMY: Conceptual framework, writing. LY: Data download, data preprocessing. WD: Conceptual framework, theoretical modeling, data analysis, writing.

Corresponding author

Correspondence to Wenzhi Ding.

Ethics declarations

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Variable definition

This table describes the definition and data source of each variable we used in our analyses.

| Variable | Definition | Source |
| --- | --- | --- |
| Dependent Variables | | |
| ESG Scandal | Dummy equal to one if there is an ESG scandal event in the next twelve months | RepRisk |
| ESG Scandal Returns | Cumulative abnormal returns for ESG scandal events. The estimation window is [−250, −20] trading days and the event window is [−10, 10] trading days | CRSP, RepRisk |
| ESG Scores | | |
| ASSET4 | ASSET4 ESG score | ASSET4 |
| S&P | S&P Global ESG score | S&P Global |
| Sustainalytics | Sustainalytics ESG score | Sustainalytics |
| KLD | KLD ESG score | KLD |
| PCA | Principal component of ASSET4, S&P, Sustainalytics, and KLD | |
| ESG Disagree | The standard deviation of ASSET4, S&P, Sustainalytics, and KLD | |
| Average | Mean of ASSET4, S&P, Sustainalytics, and KLD | |
| Company Fundamentals | | |
| Size | Natural logarithm of total assets (AT) plus one: \(\text{Size} = \log \left( AT \right) + 1\) | Compustat |
| Cash Ratio^a | Cash and short-term investments (CHE) divided by total assets (AT): \(\text{Cash Ratio} = \frac{CHE}{AT}\) | Compustat |
| Current Ratio | Current assets (ACT) divided by current liabilities (LCT): \(\text{Current Ratio} = \frac{ACT}{LCT}\) | Compustat |
| Intangibility | Intangible assets (INTAN) divided by total assets (AT): \(\text{Intangibility} = \frac{INTAN}{AT}\) | Compustat |
| ROA | Earnings before interest, taxes, depreciation, and amortization (EBITDA) divided by total assets (AT): \(\text{ROA} = \frac{EBITDA}{AT}\) | Compustat |
| CAPEX | Capital expenditures (CAPX) divided by total assets (AT): \(\text{CAPEX} = \frac{CAPX}{AT}\) | Compustat |
| Maturing Debt | Long-term debt due in one year (DD1) divided by current assets (ACT): \(\text{Maturing Debt} = \frac{DD1}{ACT}\) | Compustat |
| Leverage | Total debt including current (DT) divided by total assets (AT): \(\text{Leverage} = \frac{DT}{AT}\) | Compustat |
| Dividend Payout | Total dividends (DVT) divided by net income (NI): \(\text{Dividend Payout} = \frac{DVT}{NI}\) | Compustat |
| Sales Growth | The compound growth rate of the last 5 years' sales: \(\text{Sales Growth} = \left( \frac{SALES}{SALES_{5\ \text{years ago}}} \right)^{\frac{1}{5}} - 1\) | Compustat |
| Cash Flow Vol | The standard deviation of the last 5 years' operating activities net cash flow (OANCF): \(\text{Cash Flow Volatility} = \sigma \left( OANCF\ \text{of last 5 years} \right)\) | Compustat |

^a Please note that CHE/LCT is also a sensible measure, since it captures a firm's ability to use its cash to repay its current debt and is common practice in academia. CHE/AT provides useful information on cash holdings because not all cash is held to repay short-term debt; cash can also be used for more general purposes such as capital investment and financial policy flexibility. This measure is also common in academia, e.g., Palazzo (2012), Cao et al. (2021), and Ding et al. (2021). We also tried using CHE/LCT to measure cash holdings, and the results hold.
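To make the ratio definitions in this appendix concrete, the following minimal sketch computes several of the company fundamentals from one firm-year of Compustat-style items. It is an illustration, not the authors' code: the function and argument names are hypothetical, the input numbers are made up, zero denominators are returned as `None` by assumption, and the sample standard deviation is assumed for cash flow volatility because the appendix does not say which estimator is used.

```python
from statistics import stdev


def safe_div(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return None if denominator == 0 else numerator / denominator


def company_fundamentals(items, sales_5y_ago, oancf_last_5y):
    """Compute Appendix A ratios for one firm-year.

    items: dict keyed by the Compustat mnemonics listed in Appendix A.
    sales_5y_ago: SALES five years earlier (must be positive).
    oancf_last_5y: the last five annual OANCF values.
    """
    at = items["AT"]
    return {
        "cash_ratio": safe_div(items["CHE"], at),
        "current_ratio": safe_div(items["ACT"], items["LCT"]),
        "intangibility": safe_div(items["INTAN"], at),
        "roa": safe_div(items["EBITDA"], at),
        "capex": safe_div(items["CAPX"], at),
        "maturing_debt": safe_div(items["DD1"], items["ACT"]),
        "leverage": safe_div(items["DT"], at),
        "dividend_payout": safe_div(items["DVT"], items["NI"]),
        # Compound annual growth rate of sales over five years.
        "sales_growth": (items["SALES"] / sales_5y_ago) ** (1 / 5) - 1,
        # Sample standard deviation (an assumption, see the text above).
        "cash_flow_vol": stdev(oancf_last_5y),
    }


# Made-up firm-year, in arbitrary currency units.
firm_year = {
    "AT": 1000, "CHE": 100, "ACT": 400, "LCT": 200, "INTAN": 50,
    "EBITDA": 150, "CAPX": 80, "DD1": 20, "DT": 300, "DVT": 30,
    "NI": 60, "SALES": 3200,
}
print(company_fundamentals(firm_year, 100, [10, 12, 14, 16, 18]))
```

Note that a negative net income makes the dividend payout ratio negative; how such observations are treated is not stated in the appendix, so the sketch leaves them unchanged.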

Appendix B. ESG scores

B.1: ESG scores history

ASSET4 was acquired by Thomson Reuters in 2009, but the ESG data continued to be made available under the old name of ASSET4. After the acquisition, the name was changed to Thomson Reuters ESG Scores. However, since the name ASSET4 is widely known, we continue to use it for simplicity. It is important to note that, as of 2018, the ESG ratings data of Thomson Reuters are part of Refinitiv and are also known as Refinitiv ESG.

In 2017, Morningstar acquired an approximately 40 percent stake in Sustainalytics, and later, in 2020, it purchased the remaining approximately 60 percent of Sustainalytics equity.

The data from KLD originates from Kinder, Lydenberg, and Domini (KLD) Inc., which was acquired by Riskmetrics in 2009. In 2010, MSCI acquired Riskmetrics. For the purposes of this paper, we refer to this data as either KLD or MSCI KLD. Eccles et al. (2020) provide detailed information on the history of KLD.

B.2: Research on ESG score qualities

With the increasing availability of ESG ratings from different raters, disagreement among ESG rating providers has become a concern for investors. As evidenced by Chatterji et al. (2016) and Berg et al. (2022), the disagreement among ESG raters is substantial, with the correlation between ESG ratings ranging from 0.38 to 0.71, as shown by Berg et al. (2022). Chatterji et al. (2016) were the first to investigate the main drivers of this divergence and proposed two: "theorization" and "commensurability". They showed that differences in both theorization and commensurability affect ESG rating divergence. Berg et al. (2022) extended this work by quantifying the impact of each source of divergence. They distinguish three sources: "scope divergence" arises when ratings are calculated from different sets of attributes; "measurement divergence" emerges when different rating providers measure the same attribute with distinct indicators; and "weight divergence" appears when rating agencies assign different importance to attributes. Berg et al. (2022) documented that the aggregation rule used by ESG rating agencies can be estimated, with an accuracy of 79–99%. They also ranked the contributions of the three sources to ESG divergence: measurement divergence contributes the most (56%), scope divergence the second most (38%), and weight divergence the least, at merely 6%. Berg et al. (2022) further showed that measurement divergence is driven in part by the "rater effect", also known as the "halo effect", whereby a high score in one category induces high scores in other categories from the same rater; the rater effect explains 15% of the variation in category scores after controlling for firm and category.

Capelle-Blancard and Petit (2019) found negative market reactions to negative ESG news. Serafeim and Yoon (2023) analyzed how ESG rater disagreement affects the predictive ability of ESG ratings. They documented that the predictive value of ESG consensus will be weakened by the presence of significant disagreement. By examining the three components of disagreement proposed by Berg et al. (2022), they find that the predictive ability of consensus rating diminishes for firms with large measurement divergence, and such findings are not evident for scope divergence and weight divergence.

ESG scores are still under development. The above-mentioned literature studies the quality of previous versions of ESG scores. Whether these conclusions apply to the next generation of ESG scores is still unclear and calls for more research.
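The rater disagreement discussed in this appendix enters our variables through the Average and ESG Disagree measures defined in Appendix A (the mean and the standard deviation of the four scores). The sketch below shows one way these could be computed for a single firm-year. It assumes that the four scores have already been rescaled to a common range, that missing raters are skipped, and that the sample standard deviation is used; none of these choices is specified in the appendix, and the function name and input values are hypothetical.

```python
from statistics import mean, stdev

RATERS = ("ASSET4", "S&P", "Sustainalytics", "KLD")


def esg_average_and_disagreement(scores):
    """Return the mean and standard deviation of the available ESG scores.

    scores: dict mapping rater name to a score on a common scale,
            with None (or a missing key) for raters that do not cover the firm.
    Returns None if fewer than two scores are available, because a
    standard deviation is undefined for a single observation.
    """
    available = [scores[r] for r in RATERS if scores.get(r) is not None]
    if len(available) < 2:
        return None
    return {"average": mean(available), "disagree": stdev(available)}


# Made-up scores on a 0-100 scale for one firm-year.
example = {"ASSET4": 60, "S&P": 50, "Sustainalytics": 70, "KLD": 40}
print(esg_average_and_disagreement(example))
```

Because the raw scores from different providers are not on the same scale, the rescaling step assumed here matters: without it, the standard deviation would mostly reflect scale differences rather than disagreement.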

Appendix C: Correlation

This table presents the correlation between the explanatory variables and other control variables.

 

| Variable | Average ESG score: scandal return analysis | Average ESG score: scandal probability analysis |
| --- | --- | --- |
| ESG disagree | −0.21*** | 0.28*** |
| Size | −0.15* | −0.08 |
| Cash ratio | 0.12 | 0.01 |
| Current ratio | 0.09 | 0.01 |
| Intangibility | 0.04 | −0.01 |
| ROA | −0.06 | −0.01 |
| CAPEX | −0.04 | 0.01 |
| Maturing debt | −0.02 | −0.02 |
| Leverage | −0.03 | −0.01 |
| Dividend payout | 0.01 | 0.02 |
| Sales growth | −0.08 | −0.07 |
| Cash flow vol | −0.03 | −0.02 |
| N | 40,045 | 127,496 |

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.


About this article


Cite this article

Sun, W., Luo, Y., Yiu, SM. et al. ESG scores, scandal probability, and event returns. Financ Innov 10, 121 (2024). https://doi.org/10.1186/s40854-024-00635-1


  • DOI: https://doi.org/10.1186/s40854-024-00635-1
