1. Introduction
Market crises and accounting scandals like Black October 1987, the 1997 Asia financial turmoil, and the subprime mortgage debacle in the U.S. have received tremendous attention from academia and practitioners, as is evident from current financial literature [
1]. These extraordinary and unusual events have had fatal side effects or caused damage not only to the respective financial markets, but also to societies as a whole. Due to such practical significance, a considerable amount of research has looked into corporate financial crisis prediction [
2,
3].
Early works on financial pre-warning models were frequently based on statistical approaches. Although statistical approaches have satisfactory prediction capabilities, they rest on restrictive assumptions, such as linear separability, multivariate normality, and independent predictor variables, which often do not hold in real-life applications. With the great improvement in information technology, numerous artificial intelligence (AI) techniques that do not rely on strict statistical assumptions have been introduced to solve the aforementioned problems [
4,
5,
6]. In contrast to well-examined issues such as credit rating forecasting and financial crisis prediction, research on corporate operating performance forecasting is rather scant. It is, however, widely acknowledged that the main cause of financial crises on the corporate end is poor management, and operating performance is a suitable and reliable proxy for representing the outcome of corporate management [
7,
8,
9]. Kamei [
10] also stated that 99% of financial crises are due to bad operating performance—that is, the decline in operating performance can be viewed as a prior stage before a financial crisis bursts forth.
Return on assets (ROA) and return on equity (ROE) are the most commonly used performance measures for a firm [11,12]. Because each of these measures relates a single input variable to a single output variable, they cannot adequately capture the full range of a corporation’s operations in a turbulent economic environment. Data envelopment analysis (DEA) can deal with this obstacle, as it can handle multiple input and multiple output variables without an a priori assumption of profit maximization or cost minimization [
13]. It can also be extended into different modifications and provide a synthesized outcome, thus offering adequate flexibility for practical applications [
14,
15,
16]. How to precisely measure a corporation’s operating performance and further construct a reliable forecasting model are notable topics for decision makers [
17].
Stein [
18] stated that lower capital constraints may be one possible avenue to enhance corporate operating performance, because these constraints have a direct impact on a corporation’s ability to undertake major investment decisions and capital structure selections. Benabou and Tirole [
19] revealed that a firm with superior corporate social responsibility (CSR) performance can end up with lower capital constraints. The reasons can be summarized as follows. First, better CSR performance is connected to superior stakeholder engagement, which can eliminate short-term risky and opportunistic investment strategies and, as a consequence, decrease overall contracting costs [20]. Second, a firm with better CSR performance can spread positive messages to all market participants to signal its long-term focus [
19,
21]. Finally, better CSR performance is also linked to data accountability and transparency, thus alleviating the obstacle of information asymmetry between managers and investors and leading to lower capital constraints [
22,
23,
24,
25,
26].
Friedman [
27] conversely viewed CSR as an agency problem and indicated that CSR has a negative effect on operating performance due to corporate costs from CSR. Based on the agency cost theory, Brown et al. [
28] stated that top executives may benefit themselves utilizing their corporations’ inherent resources through philanthropy while shareholders incur a loss by such spending on charity. Barnea and Rubin [
29] also indicated that top executives overinvest in CSR to build up their personal reputation when agency costs are incurred. Despite the considerable attention this issue has received, a basic question remains unanswered: Does CSR lead to value creation (i.e., enhance corporate operating performance)? If so, in what ways? The extant studies so far have failed to give a conclusive answer [
30].
Cavaco and Crifo [
31] stated that one possible reason for this absence of consensus lies in the fact that CSR policy is multi-dimensional and consists of social, environmental, and business behavior facets. Merely implementing a singular item as a proxy for generic CSR could lead to a confused result about the relationship between CSR and operating performance [
32]. Obviously, there is an urgent requirement to break down CSR among dissimilar dimensions so as to realize its possible differential influence on corporate operating performance [
33,
34]. To deal with the aforementioned challenge, a topic extraction algorithm called latent Dirichlet allocation (LDA) was introduced [35]. After executing the LDA algorithm, one can extract informative and interpretable topics/dimensions from CSR reports and then examine the effect of each topic/dimension on operating performance.
The valuable topics/dimensions extracted by LDA and the performance rank determined by DEA can then be fed into an extreme learning machine (ELM), one of the neural network-based (NN-based) techniques, to construct the forecasting mechanism. In contrast to a traditional NN-based algorithm, an ELM not only reaches superior generalization ability with an efficient calculation speed (i.e., its input weights and hidden biases are arbitrarily decided, and the output weights are computed by performing a Moore–Penrose (MP) generalized inverse), but also avoids many of the difficulties of gradient-based training, such as manually tuning the stopping criteria, learning rate, and number of learning epochs, and becoming trapped in local minima [
36,
37,
38].
To our knowledge, extant works have not yet established a forecasting model for operating performance that simultaneously integrates numerical information from financial reports and textual information from CSR reports. This study thus further decomposes CSR into numerous dimensions to realize the effect of each dimension on performance forecasting. By doing so, top executives can realize which dimension has a great effect on operating performance and then allocate valuable resources to suitable places to reach the goal of sustainable development.
We believe that the results herein contribute to the corporate finance and CSR literature in several ways. First, this study proposed a novel mechanism to forecast operating performance, a decline in which is widely acknowledged as the stage prior to a financial crisis. Second, rather than simply investigating the influence of generic CSR on corporate operating performance, we broke down CSR into numerous dimensions with a topic extraction technique and then examined the effect of each dimension/topic on corporate operating performance. As the CSR dimensions were determined by a data-driven technique (that is, a topic extraction technique that lets the words speak for themselves), the result is less reliant on human involvement and more objective. The CSR measures in this study thus differ from those of previous studies, which focused on developing the measurement of a theoretical construct, and the study provides a different viewpoint from which to discuss the impacts of CSR on performance. Finally, employing a Taiwanese database was appealing, because most empirical works on CSR and corporate operating performance assessment have focused on developed economies (such as corporations in the U.S. and EU) with lower agency costs between managers and shareholders, and, thus, a positive effect of CSR on performance is generally supported. Firms in developing economies (such as Taiwan) and developed economies differ markedly in organizational structures and behaviors [
39,
40]. Lin [
41] also stated that the ownership structure in Taiwan is less dispersed than it is in the United States and United Kingdom, that is, the agency problem in developing economies is much more severe than in developed economies. Thus, this study used data from Taiwan to investigate the relationship between CSR and performance.
This paper is organized as follows. It starts with research methodologies adopted for this study, followed by research design, data illustration, and practical results. This paper closes with conclusions and implications for future research.
2. Methodologies
2.1. Data Envelopment Analysis
Introduced by Charnes et al. [
42], DEA is a widely used quantitative approach in economic analysis for assessing the relative efficiency of a homogeneous set of decision-making units (DMUs) with multiple input and output variables. A performance comparison of the DMUs is conducted through an assessment of the differences in their input and output variables [43]. An input index represents resource consumption, such as economic resources and labor hours; an output index represents outcomes, such as sales revenue and productivity. By integrating the input and output indices, we can make a further comparison of each DMU’s economic benefit.
The synthesized index is set up by assigning a weight to each individual index and then aggregating the weighted indices. The weights of each input variable and output variable are calculated by optimizing the relative efficiency. This is the reason why DEA can handle a problem in which the weights of the input and output indices are not known in advance and in which each DMU’s performance assessment involves a large number of objectives. The BCC (Banker–Charnes–Cooper) DEA model was executed in this study, as corporate operations often exhibit variable returns to scale (VRS) due to imperfect competition, government regulation, and financial constraints. In addition, an output orientation was adopted because, once a corporate financial investment decision has been made, it is very difficult to disinvest in order to save costs by adjusting the input variables, which renders an input orientation inappropriate.
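The output-oriented BCC DEA model evaluates the relative efficiency of $n$ corporations ($\mathrm{DMU}_j$, $j = 1, \ldots, n$). Every $\mathrm{DMU}_j$ uses $m$ inputs ($x_{ij}$, $i = 1, \ldots, m$) and generates $s$ outputs ($y_{rj}$, $r = 1, \ldots, s$). The relative performance score of $\mathrm{DMU}_k$ can be described as follows:

$$
\begin{aligned}
\min \quad & h_k = \sum_{i=1}^{m} v_i x_{ik} + v_0 \\
\text{s.t.} \quad & \sum_{r=1}^{s} u_r y_{rk} = 1, \\
& \sum_{i=1}^{m} v_i x_{ij} - \sum_{r=1}^{s} u_r y_{rj} + v_0 \ge 0, \quad j = 1, \ldots, n, \\
& u_r \ge \varepsilon > 0, \quad v_i \ge \varepsilon > 0, \quad v_0 \ \text{free in sign},
\end{aligned}
$$

where $h_k$ is the performance score of corporation $k$; $y_{rj}$ denotes the $r$-th output of the $j$-th DMU; $x_{ij}$ represents the $i$-th input of the $j$-th DMU; $u_r$ describes the weight of the $r$-th output of corporation $k$; and $v_i$ denotes the weight of the $i$-th input of corporation $k$. In addition, $\varepsilon$ expresses an extremely small positive number that keeps all $u_r$ and $v_i$ positive, and $v_0$ denotes the intercept. Based on the above-mentioned model, the optimal input/output multipliers can be decided [44].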
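As a minimal sketch of how such scores can be computed, the Python snippet below solves the envelopment (dual) form of an output-oriented BCC model with `scipy.optimize.linprog`, omitting the small positive constant on the multipliers for simplicity. The function name `bcc_output_score` and the toy data are illustrative assumptions and are not taken from this study. The returned value is the output expansion factor, which equals 1 for a DMU on the frontier and exceeds 1 otherwise.

```python
import numpy as np
from scipy.optimize import linprog


def bcc_output_score(X, Y, k):
    """Output expansion factor of DMU k under variable returns to scale.

    X: (n, m) array of inputs, Y: (n, s) array of outputs (hypothetical data).
    Solves: max phi  s.t.  sum_j lam_j x_ij <= x_ik,
                           sum_j lam_j y_rj >= phi * y_rk,
                           sum_j lam_j = 1,  lam_j >= 0.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    n, m = X.shape
    s = Y.shape[1]
    # Decision vector: [phi, lam_1, ..., lam_n]; linprog minimises, so use -phi.
    c = np.zeros(n + 1)
    c[0] = -1.0
    # Input constraints: sum_j lam_j x_ij <= x_ik.
    A_in = np.hstack([np.zeros((m, 1)), X.T])
    # Output constraints: phi * y_rk - sum_j lam_j y_rj <= 0.
    A_out = np.hstack([Y[k].reshape(s, 1), -Y.T])
    A_ub = np.vstack([A_in, A_out])
    b_ub = np.concatenate([X[k], np.zeros(s)])
    # Convexity constraint (the VRS condition): sum_j lam_j = 1.
    A_eq = np.hstack([np.zeros((1, 1)), np.ones((1, n))])
    b_eq = np.array([1.0])
    bounds = [(None, None)] + [(0.0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return float(res.x[0])


# Toy example: four DMUs, one input and one output each.
X = np.array([[2.0], [4.0], [6.0], [4.0]])
Y = np.array([[1.0], [3.0], [4.0], [2.0]])
scores = [bcc_output_score(X, Y, k) for k in range(len(X))]
```

In this toy data the fourth DMU could raise its output from 2 to 3 at its current input level, so its factor is 1.5, while the other three DMUs lie on the frontier.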
2.2. Topic Modeling: Latent Dirichlet Allocation (LDA)
Latent Dirichlet allocation (LDA) is an unsupervised probabilistic generative model for describing the latent topics of documents [35]. It models each document as a distribution over topics and each topic as a distribution over words, and both kinds of distribution are given Dirichlet priors. Let $U$ and $W_u$ represent the set of users and the bag of words determined by a user $u \in U$, respectively. Let $V_u$ be the set of specific words showing up in the bag of words $W_u$ at least once for a user $u$, and let $V$ denote the overall vocabulary. Here, $Z$ expresses the set of latent topics, and the number of topics is given as a parameter. In LDA’s generative procedure, each user $u$ has his/her own preference over the topics, expressed by a multinomial distribution over $Z$, denoted by $\theta_u$, and each topic $z \in Z$ also follows a multinomial distribution over $V$, expressed by $\phi_z$.
Figure 1 expresses the generative procedure of LDA in graphical form. In Figure 1, circles represent the random variables or model parameters in the LDA model. Shaded circles denote the variables or parameters that are observable in the dataset, and the white circles express the unobservable ones that have to be estimated by an inference mechanism. The arrows depict the dependencies between variables. We express LDA’s generative procedure below.
For each topic $z \in Z$:
Draw a multinomial distribution $\phi_z \sim \mathrm{Dirichlet}(\beta)$
For each user $u \in U$:
Draw a multinomial distribution $\theta_u \sim \mathrm{Dirichlet}(\alpha)$
For each word $w \in W_u$:
(a) Draw a topic $z \sim \mathrm{Multinomial}(\theta_u)$
(b) Draw a word $w \sim \mathrm{Multinomial}(\phi_z)$
In the LDA model, the multinomial distributions $\theta_u$ and $\phi_z$ are sampled from the Dirichlet distribution, which is the conjugate prior of the multinomial distribution, and the prior parameters of $\theta_u$ and $\phi_z$ are $\alpha$ and $\beta$, respectively. Each word $w$ in $W_u$ is presumed to be chosen by first drawing a topic $z$, following the topic preference distribution $\theta_u$, and subsequently selecting a word $w$ from the corresponding distribution $\phi_z$ of the selected topic $z$. According to the definition of the LDA model, the following equation estimates the probability that a word $w$ is generated by a user $u$:

$$
P(w \mid u, \theta_u, \phi) = \sum_{z \in Z} P(w \mid z, \phi_z)\, P(z \mid u, \theta_u).
$$
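The generative procedure can be simulated directly, which may help to make the roles of the two families of multinomial distributions concrete. The sketch below uses NumPy; the sizes, the Dirichlet parameter values, and the helper name `generate_bag` are hypothetical choices for illustration only. The last line evaluates the probability of each word for each user as a mixture over topics, i.e., the sum over topics of P(word | topic) times P(topic | user).

```python
import numpy as np

rng = np.random.default_rng(0)

n_topics, vocab_size, n_users = 3, 8, 4   # hypothetical sizes
alpha, beta = 0.5, 0.5                    # hypothetical Dirichlet parameters

# For each topic: draw a distribution over words from Dirichlet(beta).
phi = rng.dirichlet(np.full(vocab_size, beta), size=n_topics)
# For each user: draw a distribution over topics from Dirichlet(alpha).
theta = rng.dirichlet(np.full(n_topics, alpha), size=n_users)


def generate_bag(theta_u, phi, n_words, rng):
    """Generate one bag of words by drawing a topic, then a word, per position."""
    words = []
    for _ in range(n_words):
        z = rng.choice(len(theta_u), p=theta_u)   # (a) draw a topic
        w = rng.choice(phi.shape[1], p=phi[z])    # (b) draw a word from that topic
        words.append(int(w))
    return words


bags = [generate_bag(theta[u], phi, n_words=20, rng=rng) for u in range(n_users)]

# P(w | u) = sum_z P(w | z) P(z | u), for all users and words at once.
word_prob = theta @ phi
```

Each row of `word_prob` is a proper probability distribution over the vocabulary, because it is a convex combination of the topic-word distributions.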
2.3. The Proposed Model: LDA_ELM_SA
This study introduced an emerging hybrid mechanism, the LDA_ELM_SA model, for operating performance forecasting, which consisted of four main procedures: (1) establishing a CSR-related corpus by a text mining (TM) technique, (2) extracting CSR’s multi-dimensional characteristics by LDA, (3) determining the important features by fuzzy rough set theory (FRST), and (4) constructing a forecasting model by ELM with the self-adaptive mechanism.
• Procedure 1:
In order to handle a tremendous amount of CSR textual information and further condense it into manageable factors, a CSR-related corpus should first be established. To establish the CSR-related corpus reliably and soundly, we gathered CSR reports from two different categories: (1) corporations that have received a CSR award (that is, those with good CSR performance), and (2) corporations that have not received a CSR award, or corporations that have violated CSR rules, been fined for environmental pollution, faced legal problems related to the abuse of employees, or been involved in other social issues (that is, those with bad CSR performance). The CSR award is announced by CommonWealth Magazine, which is one of Taiwan’s most prestigious publishing groups, and is based on four core subjects: corporate governance, enterprise commitment, community involvement and development, and environmental protection.
Li [
46] indicated that financial reports contain a large number of optimistic terms that could represent a corporation’s chances of gaining profit in the following years. Based on this concept, we collected CSR reports from the two categories and then performed a Chi-square test to identify the special terms that are representative of each category, that is, the terms whose usage differs markedly between the two categories. How to quantify the importance of each special term is another question of interest. The entropy technique, which is commonly used in computational linguistic analysis as a measure of the relative weights among terms, was applied to all special terms. As corporate CSR reports are guided by the Global Reporting Initiative (GRI), terms used in the reports are very specific and precise. Thus, the result from the entropy method is reliable.
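The two steps of Procedure 1 can be illustrated with a short sketch. The text does not give the exact formulas used, so the code below assumes the standard Chi-square statistic for a 2×2 term-by-category table and the common entropy weight method, in which a term whose occurrences are concentrated in few reports (low entropy) receives a higher weight. The function names and the counts are hypothetical.

```python
import math


def chi_square(a, b, c, d):
    """Chi-square statistic of a 2x2 term-by-category table.

    a: good-CSR reports containing the term, b: good-CSR reports without it,
    c: bad-CSR reports containing the term,  d: bad-CSR reports without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def entropy_weights(freq):
    """Entropy weights for terms; freq[i][j] = count of term j in report i.

    Assumes at least two reports and at least one term with non-uniform counts.
    """
    n_docs, n_terms = len(freq), len(freq[0])
    divergence = []
    for j in range(n_terms):
        total = sum(freq[i][j] for i in range(n_docs))
        entropy = 0.0
        for i in range(n_docs):
            p = freq[i][j] / total if total else 0.0
            if p > 0:
                entropy -= p * math.log(p)
        entropy /= math.log(n_docs)        # normalise to [0, 1]
        divergence.append(1.0 - entropy)   # low entropy -> high weight
    s = sum(divergence)
    return [d / s for d in divergence]


# Hypothetical counts: a term found in 30 of 40 good-CSR and 10 of 40 bad-CSR reports.
stat = chi_square(30, 10, 10, 30)
# Hypothetical frequencies of two terms in three reports.
weights = entropy_weights([[5, 9], [5, 1], [5, 0]])
```

In this toy input the first term is spread evenly over the reports and therefore receives a weight of essentially zero, whereas the second term carries almost all of the weight.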
• Procedure 2:
In order to deal with CSR’s multi-dimensional nature, LDA was performed. LDA is a generative model that can be applied to a corpus of documents made up of categorical data [
35,
47,
48]. In this algorithm, each document is expressed as a random mixture of latent topics, which are multinomial distributions over the unique words in a corpus [
49]. The generative procedure for a document begins by drawing a topic distribution from the corpus-level Dirichlet distribution. Subsequently, for each word in the document, a topic is drawn and a word is then picked from the corresponding multinomial distribution. After executing the LDA algorithm, we could decompose CSR into a number of essential dimensions and further match the preserved dimensions with the pre-determined CSR corpus so as to condense a large amount of textual information into manageable elements. By doing so, we could examine which dimensions had a great effect on performance.
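The text does not state which inference method was used to fit LDA. As one possible illustration, the sketch below implements a small collapsed Gibbs sampler in NumPy. The function name `lda_gibbs`, the hyperparameter values, and the toy corpus of word identifiers are assumptions made for this example only.

```python
import numpy as np


def lda_gibbs(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Collapsed Gibbs sampling for LDA; docs is a list of lists of word ids."""
    rng = np.random.default_rng(seed)
    ndk = np.zeros((len(docs), n_topics))   # topic counts per document
    nkw = np.zeros((n_topics, vocab_size))  # word counts per topic
    nk = np.zeros(n_topics)                 # total words per topic
    z = []
    for d, doc in enumerate(docs):          # random initial topic assignments
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, k in zip(doc, zd):
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                 # remove the current assignment
                ndk[d, k] -= 1
                nkw[k, w] -= 1
                nk[k] -= 1
                # Conditional distribution of the topic given all other assignments.
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + vocab_size * beta)
                k = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = k                 # record the new assignment
                ndk[d, k] += 1
                nkw[k, w] += 1
                nk[k] += 1
    theta = (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)
    phi = (nkw + beta) / (nkw + beta).sum(axis=1, keepdims=True)
    return theta, phi


# Toy corpus of four short "reports" over a six-word vocabulary.
docs = [[0, 1, 0, 1, 2], [3, 4, 3, 4, 5], [0, 2, 1, 0], [5, 3, 4, 4]]
theta, phi = lda_gibbs(docs, n_topics=2, vocab_size=6)
```

The rows of `theta` give the estimated topic mixture of each report, which is the kind of per-document dimension score that can then be passed on to feature selection, and the rows of `phi` give the word distribution of each topic.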
• Procedure 3:
The collected information is usually full of irrelevant, redundant, and useless material, which in most situations misleads the research findings, yields biased results, and increases the computational burden. Feature subset selection is therefore an essential stage prior to forecasting model construction. The purpose of feature selection is to remove redundant or useless features without significantly deteriorating the forecasting performance, so that the class distribution obtained from the model constructed merely on the selected feature subset is as close as possible to the original class distribution [
50]. Rough set theory (RST) was proposed by Pawlak [
51] as a tool for handling data with uncertainty, vagueness, and incompleteness and has been applied successfully to feature subset selection and rule induction in numerous research fields [
52]. However, the classical RST is defined with equivalence relations that impede its capability at handling data with numerical or fuzzy values [
53]. Discretization of the numerical data is one avenue to solve this challenge, but it may result in information loss. In order to cope with both categorical and numerical data in datasets, fuzzy rough set theory (FRST) was introduced by Dubois and Prade [54] through a combination of RST and fuzzy set theory (FST). In this approach, a fuzzy similarity relation is used to assess the similarity between samples. The fuzzy upper and lower approximations of a decision are then determined by utilizing the fuzzy similarity relation [55]. The fuzzy positive region is defined as the union of the fuzzy lower approximations of the decision equivalence classes. As fuzziness is introduced to RST, more of the valuable information in real-valued data is preserved. Feature subset selection with FRST has gained considerable attention in recent years [
56]. Thus, FRST was conducted in this study. For a more detailed illustration of FRST, please refer to Jensen and Shen [
57].
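To illustrate the idea of fuzzy-rough feature selection, the sketch below computes a fuzzy-rough dependency degree and uses it in a greedy forward search in the spirit of QuickReduct. The specific choices here (a range-normalised similarity per feature, the minimum t-norm to combine features, the Łukasiewicz implicator, and crisp decision classes) are assumptions for this example and may differ from the configuration used in this study.

```python
import numpy as np


def fuzzy_similarity(X, features):
    """Fuzzy similarity matrix for a feature subset (minimum t-norm)."""
    n = X.shape[0]
    R = np.ones((n, n))
    for a in features:
        col = X[:, a]
        span = col.max() - col.min()
        if span == 0:
            span = 1.0  # constant feature: every pair is fully similar
        R_a = np.maximum(0.0, 1.0 - np.abs(col[:, None] - col[None, :]) / span)
        R = np.minimum(R, R_a)
    return R


def dependency(X, y, features):
    """Fuzzy-rough dependency degree of the decision y on a feature subset."""
    R = fuzzy_similarity(X, features)
    different = y[:, None] != y[None, :]
    # With crisp classes and the Lukasiewicz implicator, the positive-region
    # membership of a sample is the minimum of 1 - R over other-class samples.
    pos = np.where(different, 1.0 - R, 1.0).min(axis=1)
    return float(pos.mean())


def quick_reduct(X, y, tol=1e-9):
    """Greedy forward selection that stops when dependency no longer improves."""
    selected, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        gains = [(dependency(X, y, selected + [a]), a) for a in remaining]
        score, a = max(gains, key=lambda t: t[0])
        if score <= best + tol:
            break
        selected.append(a)
        remaining.remove(a)
        best = score
    return selected, best


# Toy data: feature 0 separates the two classes, feature 1 is constant.
X = np.array([[0.0, 0.5], [0.1, 0.5], [0.9, 0.5], [1.0, 0.5]])
y = np.array([0, 0, 1, 1])
selected, gamma = quick_reduct(X, y)
```

On this toy data only the first feature is retained, because adding the constant feature does not increase the dependency degree.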
• Procedure 4:
The features selected by FRST were finally used to construct the corporate operating performance forecasting model. In traditional ELM, the number of hidden nodes is pre-decided, and the hidden node parameters are arbitrarily decided and remain unchanged during the training process, resulting in the existence of non-optimal nodes and a failure to reach the aim of cost function minimization [
58]. In order to overcome the aforementioned problems, ELM with a self-adaptive (ELM_SA) mechanism was proposed. In ELM_SA, the hidden node learning parameters are optimized by a self-adaptive differential evolution approach, whose trial vector construction strategies and their associated control parameters are self-adaptive in a strategy pool by learning from their prior experiences in providing a promising solution, and the network output weights are computed via the Moore–Penrose (MP) generalized inverse. To ameliorate the impact of over-fitting, five-fold cross-validation was conducted.
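For reference, the basic ELM that ELM_SA builds on can be sketched in a few lines of NumPy: the input weights and hidden biases are drawn at random and kept fixed, and only the output weights are solved through the Moore–Penrose generalized inverse. The self-adaptive differential evolution search over the hidden node parameters is not implemented here, and the class name `ELM`, the `tanh` activation, and the toy regression data are assumptions for illustration.

```python
import numpy as np


class ELM:
    """Single-hidden-layer ELM with randomly fixed hidden parameters."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer output matrix H.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y):
        n_features = X.shape[1]
        # Input weights and hidden biases are drawn at random and never trained.
        self.W = self.rng.normal(size=(n_features, self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        H = self._hidden(X)
        # Output weights via the Moore-Penrose generalized inverse of H.
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Toy regression problem (hypothetical data).
X = np.linspace(-3, 3, 60).reshape(-1, 1)
y = np.sin(X).ravel()
model = ELM(n_hidden=20).fit(X, y)
mse = float(np.mean((model.predict(X) - y) ** 2))
```

In a self-adaptive variant, `W` and `b` would be treated as a candidate solution that an evolutionary search improves, with the output weights recomputed by the same generalized inverse for every candidate.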
Figure 2 depicts the flowchart of the proposed LDA_ELM_SA model.
2.4. The Empirical Question
After going through the aforementioned procedures (see
Figure 2), we introduced the model. The performance status (i.e., dependent indicator) was determined by DEA. The independent indicators were mainly collected from two parts: (1) quantitative predictors derived from financial statements, and (2) qualitative predictors collected from CSR reports. Our study examined whether the CSR textual reports contained supplementary information beyond what numerical indicators could reveal. Therefore, we proposed the following hypothesis.
Hypothesis 1 (H1). The performance forecasting model with textual indicators has higher forecasting capability than the traditional numerical models.
4. Conclusions and Implications
Prior related works have yielded inconclusive results regarding the relationship between corporate operating performance and corporate social responsibility. Those conflicting results may be attributed to differences in how financial performance and CSR status are measured [32,79]. Accounting-based ratios, such as ROA and ROE, are the most commonly used measures of corporate operating performance, but each relates only one input variable to one output variable. Employing only one input variable and one output variable to determine such performance is not reliable in a highly fluctuating economic environment. To overcome this obstacle, this study executed DEA, as it can handle multiple input variables and multiple output variables simultaneously.
A firm’s CSR policy is multi-dimensional and includes numerous aspects, such as environmental, business, and social factors [
79]. Using merely one general indicator to depict a corporation’s CSR status could obscure these differences. Thus, to better understand the relationship between CSR and corporate financial performance, this study opened up the black box of CSR (i.e., breaking it into numerous dimensions by LDA) and examined the impact of each CSR dimension on corporate financial performance. Doing so allows top executives to allocate limited resources to specific CSR dimensions that are highly related to corporate financial performance.
In contrast with well-studied topics such as financial crisis prediction, research on corporate operating performance forecasting is quite rare. It is widely deemed that a corporation with worsening operating performance could be at the stage before a financial crisis breaks out. If top executives can receive this red flag (that is, a crisis warning signal) earlier, then they can initiate a modification of their firm’s capital structure and adjust investment strategies so as to enhance its risk-absorbing ability. To our knowledge, no study in the literature has introduced a model for corporate operating performance forecasting that simultaneously considers multiple dimensions of CSR broken down by LDA and uses DEA as the performance measure. To fill this gap in the CSR and corporate finance literature, this study proposed a corporate operating performance forecasting model that consists of multiple dimensions of CSR, feature subset selection by FRST, and model construction by ELM_SA. The proposed model, examined through real cases, is a promising alternative for corporate operating performance forecasting.
This study had some limitations that could be addressed in future studies. First, we worked on a target sample from Taiwan’s electronics industry, which limits the generalizability of the results. Second, future research can enhance the model’s forecasting quality by integrating more sophisticated structures, such as ensemble learning and stacking [
80,
81]. Finally, recent related works suggest that the debate concerning corporate operating performance and CSR status should be taken further by including additional intermediate variables or the variables derived from theoretical constructs that can enhance our realization of the procedures through which CSR influences corporate operating performance [
72,
73,
76,
82].