Article

Connectivity Analysis for Multivariate Time Series: Correlation vs. Causality

Department of Economics, University of Macedonia, 54636 Thessaloniki, Greece
Entropy 2021, 23(12), 1570; https://doi.org/10.3390/e23121570
Submission received: 24 October 2021 / Revised: 17 November 2021 / Accepted: 24 November 2021 / Published: 25 November 2021
(This article belongs to the Special Issue Granger Causality and Transfer Entropy for Financial Networks)

Abstract

The study of the interdependence relationships of the variables of an examined system is of great importance and remains a challenging task. There are two distinct cases of interdependence. In the first case, the variables evolve in synchrony, connections are undirected and the connectivity is examined based on symmetric measures, such as correlation. In the second case, a variable drives another one and they are connected with a causal relationship; directed connections therefore entail the determination of the interrelationships based on causality measures. The main open question that arises is the following: can symmetric correlation measures or directional causality measures be applied to infer the connectivity network of an examined system? Using simulations, we demonstrate the performance of different connectivity measures in the case of contemporaneous and/or temporal dependencies. Results reveal the sensitivity of correlation measures when temporal dependencies exist in the data. On the other hand, causality measures do not spuriously indicate causal effects when the data present only contemporaneous dependencies. Finally, the necessity of introducing effective instantaneous causality measures is highlighted, since they are able to handle both contemporaneous and causal effects at the same time. Results based on instantaneous causality measures are promising; however, further investigation is required in order to achieve an overall satisfactory performance.

1. Introduction

There are various challenges in the analysis of multivariate high-dimensional systems, such as in the analysis of financial and neurophysiological data. The goal of each application and the features of the examined data should be considered in order to determine the suitable connectivity analysis scheme. For example, financial time series are nonstationary, contain nonlinearities and exhibit volatility clustering, whereas data in neuroscience experiments may present a high temporal resolution and be subject to artifacts and periodic respiratory or cardiac noise.
Connectivity analysis focuses on identifying the interdependence relationships of the variables of a complex system. Connectivity measures can be subdivided into two main categories based on whether they quantify the direction of a relationship. Nondirectional measures assume that variables evolve in synchrony and the connectivity is examined based on symmetric measures, such as dependence measures. On the other hand, directed measures quantify the causal effects among the variables, assuming that causes precede their effects in time, such as Granger causality measures [1]. A further subdivision of both categories differentiates between model-based and model-free connectivity measures. Both categories of measures consist of measures calculated in the time, frequency or phase domain. Spectral measures of dependence infer the dependence between oscillatory components of the examined data.
Given the abundance of connectivity measures developed so far, there is an urgent need to compare them and clarify the usefulness of each method. Comparisons are mainly performed in terms of applications of interest. Indicatively, comparisons among correlation measures can be found in [2,3,4,5,6,7,8,9,10,11], among synchronization measures in [12,13,14,15,16,17] and among causality measures in [16,18,19,20,21,22,23,24,25,26]. However, the literature lacks comprehensive connectivity evaluations including both correlation and causality methods. Causality is either examined on its own or as a supplement to correlation, i.e., correlation is treated as a sign of causality, although it is not sufficient to infer causality [27,28,29,30].
There are a number of limitations and common pitfalls regarding the connectivity measures that complicate the correct inference of the connectivity of a system. For example, correlation measures usually assume linearity, do not handle outliers and are affected by the sample size, and interpretational mistakes are also made when discussing their outcomes [31,32,33,34,35]. Some limitations relevant to causal measures are the sample size bias, the common input problem, the effect of noise and the curse of dimensionality [36,37,38,39,40,41,42]. We briefly discuss these problems below.
The purpose of this paper is to provide a brief review of connectivity analysis, listing the most well-known connectivity measures and identifying possible pitfalls when forming connectivity networks. Selecting between correlation and causality measures for inferring the connectivity structure of an examined system is of key interest in this study; therefore, it is essential to demonstrate the performance and pitfalls of the different connectivity measures, focusing on those that have not been stressed and investigated extensively so far. In particular, we examine the influence of existing contemporaneous interdependencies on the extracted causal network and the effect of causal relationships when forming correlation networks and, finally, investigate the case of having both contemporaneous and causal relationships among the variables of a multivariate system.

2. Non-Directional Connectivity Measures

The examination of the dependencies among the variables of a complex system is an essential task in statistics. Non-directional, symmetric connectivity measures aim to capture and quantify the strength of associations between the variables of a complex system, without indicating the direction of the relationship. Correlation measures are utilized for the characterization of statistical relationships.
The most commonly used correlation measure is the Pearson correlation coefficient, also known as the Pearson product-moment correlation coefficient, bivariate correlation or simply the correlation coefficient [43]. It is expressed as a normalised measure of the covariance and can only account for linear relationships between the variables. It is not robust to outliers and is sensitive to the data distribution.
Many alternatives to the correlation coefficient have been developed, aiming to address its limitations. Rank-based analogues of the correlation coefficient do not make any assumptions about the frequency distribution of the variables and do not assume a linear relationship between them; moreover, they can be used for variables measured on interval as well as ordinal scales. Spearman’s rank correlation coefficient [44] and Kendall’s rank correlation coefficient [45] are the most common non-parametric alternatives to the correlation coefficient.
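As a minimal illustration (the data and parameters here are hypothetical, not from the study), the three coefficients can be compared on a strictly monotone but nonlinear relation, where the rank-based measures attain their maximum while Pearson's coefficient does not:

```python
# Sketch: Pearson vs. rank-based correlation on a monotone nonlinear
# relation (illustrative synthetic data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.uniform(0, 3, 500)
y = np.exp(x)  # strictly increasing, but strongly nonlinear

r_p, _ = stats.pearsonr(x, y)
r_s, _ = stats.spearmanr(x, y)
r_k, _ = stats.kendalltau(x, y)
# The ranks of y equal the ranks of x, so Spearman and Kendall are
# exactly 1, while Pearson underestimates the association.
print(r_p, r_s, r_k)
```

Because Spearman and Kendall operate on ranks, any strictly monotone transformation of a variable leaves them unchanged, which is precisely the robustness property discussed above.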
Partial correlation measures quantify the degree of association between two variables while controlling for the effect of any additional variables of the examined system. Their main contribution is that they can uncover spurious relationships and detect hidden relationships. The most common partial measure is the linear partial correlation coefficient, which can be estimated based on the bivariate linear correlation coefficients of the examined variables. Its nonparametric analogue is the partial Spearman’s rank correlation coefficient.
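A linear partial correlation can be computed directly from the three pairwise Pearson coefficients. The sketch below (hypothetical data; the common-driver setup is illustrative) shows how partialling out a confounder Z removes a spurious X–Y association:

```python
# Sketch: linear partial correlation of X and Y controlling for Z,
# computed from the pairwise Pearson coefficients.
import numpy as np

def partial_corr(x, y, z):
    """rho_{XY.Z} = (r_xy - r_xz*r_zy) / sqrt((1-r_xz^2)(1-r_zy^2))."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_zy = np.corrcoef(z, y)[0, 1]
    return (r_xy - r_xz * r_zy) / np.sqrt((1 - r_xz**2) * (1 - r_zy**2))

# A common driver Z induces a spurious X-Y correlation that
# controlling for Z removes.
rng = np.random.default_rng(1)
z = rng.normal(size=5000)
x = z + rng.normal(size=5000)
y = z + rng.normal(size=5000)
print(np.corrcoef(x, y)[0, 1])  # substantial bivariate correlation
print(partial_corr(x, y, z))    # near zero once Z is controlled for
```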
Hoeffding’s test of independence is a test of correlation for two variables with continuous distribution functions [46]. The multivariate Hoeffding’s phi-squared test is an extension of Hoeffding’s test of independence that expresses multivariate association [47]. Biweight midcorrelation is a measure of similarity between samples; it is median-based and, therefore, more robust to outliers than Pearson’s correlation.
The coefficient of determination is based on linear regression models [48] and is equal to the squared value of the correlation coefficient. Pearson’s correlation can be zero for dependent random variables. Distance correlation was developed aiming to face this limitation, i.e., a zero distance correlation implies independence [49,50]. Later, partial distance correlation was developed with methods for dissimilarities [51].
The odds ratio quantifies the dependence between two binary variables and ranges from zero to infinity. Yule [52,53] introduced two normalized versions of the odds ratio, namely Yule’s Q and the coefficient of colligation, or Yule’s Y. Various extensions have been defined subsequently, such as Digby’s coefficient H [54] and the coefficient Y* [55].
Copulas are used to model the dependence between random variables. For example, the Randomized Dependence Coefficient measures the dependence between multivariate random variables [56]. Copula correlation is a copula-based measure exploiting kernel density estimators [57]. A class of copula-based dependence coefficients have been developed, which are computationally efficient and can handle a wide range of associations [58,59,60,61,62,63].
The continuous analysis of variance test (CANOVA) is a nonlinear correlation measure defined by the neighborhoods of the data points of two continuous variables [64]. The rationale of CANOVA is that neighboring values of the first variable should lead to correlation of neighboring values of the second variable.
An ensemble of information-theoretic alternatives to the linear correlation coefficient aim to account for nonlinearities and temporal dependencies in the data. Mutual information is used as a generalized correlation measure that quantifies nonlinear associations [65,66]. Mutual information is also used to form the Mutual Information Matrix for the study of nonlinear interactions in multivariate time series [67]. Time-delayed mutual information takes into account the lag difference of the variables [65].
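To illustrate why mutual information is a useful generalization, the sketch below uses a simple histogram (plug-in) estimator rather than the more refined estimators cited above; the data, bin count and noise level are illustrative assumptions. The relation y = x² is dependent but linearly uncorrelated, so Pearson's coefficient misses it while MI does not:

```python
# Sketch: plug-in (histogram) estimate of mutual information in nats,
# showing nonlinear dependence invisible to linear correlation.
import numpy as np

def mutual_info_hist(x, y, bins=16):
    # Joint and marginal probabilities from a 2-D histogram.
    pxy, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = pxy / pxy.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0  # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

rng = np.random.default_rng(2)
x = rng.normal(size=10000)
y = x**2 + 0.1 * rng.normal(size=10000)  # dependent yet uncorrelated
print(np.corrcoef(x, y)[0, 1])  # near zero
print(mutual_info_hist(x, y))   # clearly positive
```

The histogram estimator is biased upward for small samples, which is one reason the literature prefers k-nearest-neighbor estimators such as the one used later in this study.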
Nonlinear correlation information entropy is specified by the rank sequences obtained from the original data sets [64]. The entropy correlation coefficient [68] and the entropy coefficient of determination [69] have been utilized due to their desirable properties, such as being easily explicable and able to express nonlinear associations. The maximal information coefficient has been utilized for the detection of linear and nonlinear relationships in large data sets [70]. The partial maximal information coefficient captures the association between two variables, removing the effect of a third random variable [71]. The time-delayed mutual information of the phase has been introduced for determining nonlinear synchronization in electrophysiological data [72].
Time domain measures express the variation of the amplitude of a signal with time, whereas frequency domain analysis of signals is performed with reference to frequency rather than time. The coherence function quantifies linear correlations in the frequency domain [73]. Cross-coherence is the equivalent of cross-correlation in the frequency domain [74]. Partial coherence has been defined in order to quantify the strength of a conditional relationship between two neurons and is suitable for multivariate analysis, using the predictor to distinguish direct connections from common inputs [75]. As an extension, multivariate partial coherence analysis has been introduced [76].
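As a brief frequency-domain illustration, the magnitude-squared coherence of two noisy signals sharing a common oscillation can be estimated with Welch's method via scipy.signal.coherence; the sampling rate, 10 Hz component and noise level below are illustrative choices, not values from the study:

```python
# Sketch: magnitude-squared coherence of two signals sharing a 10 Hz
# component, each corrupted by independent noise.
import numpy as np
from scipy import signal

fs = 200.0                      # sampling frequency (Hz)
t = np.arange(0, 20, 1 / fs)    # 20 s of data
rng = np.random.default_rng(3)
common = np.sin(2 * np.pi * 10 * t)
x = common + 0.5 * rng.normal(size=t.size)
y = common + 0.5 * rng.normal(size=t.size)

f, cxy = signal.coherence(x, y, fs=fs, nperseg=256)
# Coherence peaks near the shared 10 Hz oscillation and stays low
# at frequencies where only independent noise is present.
print(f[np.argmax(cxy)])
```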
Phase synchronization stems from the notion of synchronization of chaotic oscillators, whereas the global diversion of the phases of the signals is examined [77,78]. Different measures of phase synchronization have been developed, such as the mean phase coherence [12,79].
Phase locking is a key notion in dynamical systems. Phase locking value is a measure robust to fluctuations in amplitude that quantifies the absolute value of the mean phase difference between two signals [12,78].
Recurrence analysis examines how close the states of a dynamical system are after some time [80]. Its bivariate extension, the cross recurrence plot, examines the dependencies between two different systems [81,82]. Several measures stem from recurrence analysis, such as the recurrence rate, determinism and the maximal length of diagonal structures [83,84].
We should note here that most correlation measures are bivariate, since their multivariate extensions most often lead to directional measures. An example is the conditional mutual information, an extension of mutual information that accounts for the remaining variables of the system and makes inference about the directionality of the information flow [85,86,87].
A list of some well known correlation measures in time and frequency domain is displayed in Table 1.
The different correlation measures have been vastly applied in different fields, such as in finance, neurophysiology, meteorology, biology and engineering. Applications include the identification of genomic associations [89,90], the examination of the association of proteins in the pathogenesis of Parkinson’s disease [91], noise reduction [92,93], identification of disease-specific biomarker genes [94], multimodal image registration [95], portfolio optimization [96,97], investment decisions [98], wind power combination prediction [99], genetic interactions [100], artificial neural network model development that concerns water treatment plants [101], testing tourism economies and islands’ resilience to the global financial crisis [102], electroencephalograms (EEG) analysis [103], the study of financial markets [104,105], recognizing multiple positive emotions by analyzing brain activities [106], identifying meteorological parameters that play a major role in the transmission of infectious diseases such as COVID-19 [107] and stock trend prediction [108,109].

3. Directional Connectivity Measures

Directional connectivity measures seek to infer the direction of the relationship from the data samples, relying on the principle that causes precede their effects. The most common procedure of causal discovery is Granger causality, where probabilistic causation relies on the concept that causes change the probabilities of their effects [1,110].
Model-based directional approaches assume the linearity of interactions. The standard linear Granger causality is the pioneer technique based on autoregressive models that seeks to determine whether prediction of the target (driven) variable can be improved by exploiting past values of the source (driving) variable [1].
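The prediction-improvement idea can be sketched in a few lines. The example below implements a bivariate Granger causality index ln(s_R²/s_U²) with order-1 autoregressive models fitted by ordinary least squares; the coupling coefficients and model order are illustrative assumptions, not the article's exact estimator:

```python
# Sketch: bivariate Granger causality index with order-1 models.
import numpy as np

def gci(x, y):
    """Granger causality index for Y -> X: ln(s2_restricted / s2_unrestricted)."""
    xt, xp, yp = x[1:], x[:-1], y[:-1]
    Xr = np.column_stack([np.ones_like(xp), xp])       # x_t ~ x_{t-1}
    Xu = np.column_stack([np.ones_like(xp), xp, yp])   # x_t ~ x_{t-1}, y_{t-1}
    s2r = np.mean((xt - Xr @ np.linalg.lstsq(Xr, xt, rcond=None)[0]) ** 2)
    s2u = np.mean((xt - Xu @ np.linalg.lstsq(Xu, xt, rcond=None)[0]) ** 2)
    return np.log(s2r / s2u)

# Simulate Y driving X with one lag (coefficients illustrative).
rng = np.random.default_rng(4)
n = 2000
y = rng.normal(size=n)
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.4 * x[t - 1] + 0.8 * y[t - 1] + rng.normal()

print(gci(x, y))  # clearly positive: past of Y improves prediction of X
print(gci(y, x))  # near zero: past of X does not improve prediction of Y
```

The index is zero when the source variable's past adds no predictive power, which is the null hypothesis of the Granger non-causality test.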
Various model-based extensions of the standard Granger causality test have been developed so far. The conditional Granger causality is its multivariate extension that exploits all the available information of the observed data [111]. Partial Granger causality is an extension of conditional Granger causality developed to face the problem of exogenous inputs and latent variables [112]. Further parametric causality methods have been introduced, such as methods defined on radial basis functions [113], kernel functions [114] and nonlinear autoregressive exogenous models [115].
Non-parametric extensions of Granger causality to nonlinear cases in the time domain include the Baek and Brock test [116], the Hiemstra and Jones test [117] and the Diks and Panchenko test [118], while [119] extends the Hiemstra and Jones test to multivariate settings.
Numerous directional measures stem from information theory. These model-free approaches infer linear but also nonlinear interactions. Transfer entropy is the most well-known information measure for studying directed interactions [120]. Partial transfer entropy extends the bivariate transfer entropy to the multivariate case, where confounding variables are also considered in the estimations [121,122]. Some further information causality measures based on the nonuniform embedding scheme are the partial transfer entropy [123], (partial) mutual information on mixed embedding [124,125] and transfer entropy based on low-dimensional approximations of conditional mutual information [126,127].
Linear cross-correlation is the simplest and most well known synchronization measure defined as the ratio of covariance to root-mean variance of the two signals. Event synchronization is another simple and computationally efficient method that quantifies synchronicity and time delay patterns between signals [128].
Various nonlinear interdependence measures have been developed within the framework of nonlinear prediction theory that use the neighborhoods of the reconstructed points of the state space to determine nonlinear driver-response relationships [129,130,131,132,133,134]. As an extension of the above, the (conditional) extended Granger causality further employs a linear model for all the points in the neighborhood of each reference point of the reconstructed state space [135]. A more recent bivariate causality method based on nonlinear state space reconstruction can be found in [136]. It relies on empirical dynamic modeling (convergent cross mapping) and was introduced for inferring causality in complex systems that do not satisfy the separability assumption, i.e., when the cause and the effect are non-separable.
Graphical models have been suggested by [137] to account for probabilistic independence relationships between variables without relying on temporal information. Probabilistic graphical models combine graph theory and probability theory.
The field of causal discovery was shaped by [138,139], where causal interpretation of the graphs was achieved based on Bayesian network models. The PC algorithm is the pioneer structure-learning algorithm for directed graphs [138], under the assumption of the causal Markov condition. After the introduction of the PC algorithm and Fast Causal Inference [140], an ensemble of different causal discovery methods based on graphical models has been developed [141,142,143,144]. Markov discovery algorithms such as the PC algorithm cannot be directly used for time series; therefore, ref. [145] adapted the Fast Causal Inference algorithm for time series.
The Peter Clark momentary conditional independence algorithm is a causal discovery method that incorporates linear or nonlinear conditional independence tests to determine causal networks from multivariate time series data [146]. This measure was designed for climate applications and can therefore handle strong interdependencies in the sample. An extension aiming to improve the computational efficiency is the Fast Approximate Causal Discovery Algorithm [147].
Data from the time domain can be converted to the frequency domain with mathematical operators, such as the Fourier transform, which converts a time function into a sum or integral of sine waves of different frequencies. Causality measures in the frequency domain have been widely applied for the analysis of neurophysiological data. The majority of the developed spectral measures are based on linear models, and thus can only detect linear causal effects in the frequency domain; examples include Geweke’s spectral Granger causality [111], the directed transfer function [148], the partial directed coherence [149], the direct Directed Transfer Function [150], the Generalized Partial Directed Coherence [151] and the Phase Slope Index [152]. Recently, a frequency-domain approach for testing for short- and long-run causality has been introduced in [153].
Nonparametric methods have been also employed, such as the nonparametric approach based on Fourier and wavelet transforms in [154], the nonparametric partial directed coherence [155] and the DEKF-based extension of partial directed coherence, where the parameters of the time-varying autoregressive model are estimated using the Dual Extended Kalman Filter (DEKF) [156]. Further, the nonlinear partial directed coherence aims to model the nonlinear relationships of the examined time series using nonlinear models and generalized frequency response functions [157].
Granger causality relationships are examined by considering the past values of the involved variables. However, the prediction of the target variable may in some cases be improved by including the available current information of the source variable. In such a case, the instantaneous causality relation between the source and target variable should be considered [158]. For example, contemporaneous relationships are present if the regression residuals of the data are correlated.
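The residual-correlation criterion can be demonstrated directly: fit a VAR model by least squares and correlate the residuals of the two equations. In the sketch below (coefficients and innovation correlation are illustrative assumptions), the lagged dynamics are fully captured by the VAR, so the remaining residual correlation reflects only the contemporaneous link:

```python
# Sketch: detecting instantaneous (zero-lag) dependence as the
# correlation of VAR(1) residuals.
import numpy as np

rng = np.random.default_rng(5)
n = 2000
# Innovations with a contemporaneous correlation of 0.7.
e = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=n)
xy = np.zeros((n, 2))
for t in range(1, n):
    xy[t] = 0.5 * xy[t - 1] + e[t]

# Fit VAR(1) by least squares and correlate the residuals.
Y = xy[1:]
X = np.column_stack([np.ones(n - 1), xy[:-1]])
B = np.linalg.lstsq(X, Y, rcond=None)[0]
resid = Y - X @ B
print(np.corrcoef(resid[:, 0], resid[:, 1])[0, 1])  # recovers ~0.7
```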
Within the framework of stationary autoregressive modeling, instantaneous causality is usually tested using Wald tests for zero restrictions on the innovations’ covariance matrix. Extended Granger causality accounts for zero-lag effects in the linear regression schemes implemented by the VAR model [159]. Instantaneous causality in the presence of non-constant unconditional variance is examined in [160]. Instantaneous causality measures defined on structural vector causal models are presented in [161,162,163,164].
A causality framework in the frequency domain that considers instantaneous effects is introduced in [165]. An instantaneous measure of causality relying on the information versions of directed transfer entropy and partial directed coherence, estimated after decomposing the coherencies and partial coherencies, is presented in [166]. Instantaneous Granger causality measures based on the Hilbert-Huang transform are introduced in [167].
Compensated transfer entropy is a nonlinear causality measure that accounts for contemporaneous relationships [168,169]. A multivariate Granger causality measure including instantaneous variables in the conditional set, based on a decomposition of conditional directed information, is discussed in [170]. Partial mutual information from mixed embedding that also considers zero-lag effects, denoted as PMIME0, addresses the problem of determining the connectivity network from multivariate time series in the presence of unobserved variables [171]. Finally, PCMCI+ is a causality measure based on conditional independence tests that searches for causal and contemporaneous parents in order to infer lagged and contemporaneous causal relationships.
A list of well known directional connectivity measures in time and frequency domain is displayed in Table 2.
The pioneer Granger non-causality test has been developed for analyzing financial data [1]; however, it is now vastly applied in various fields, such as for the analysis of magnetoencephalography (MEG) and electroencephalography (EEG) data [173,174]. Granger causality and its extensions, along with the alternative causality measures that have been developed afterwards, are vastly used in different applications. Among others, causality measures are utilized in financial applications, e.g., for the examination of the relation of stock markets [175,176], in neuroscience, e.g., for the analysis of brain structures and physiological time series [150,169,177], in seismology, e.g., for the analysis of earthquake data [178], in geoscience, e.g., for the discovery of weather and vegetation conditions on global wildfire [179], in meteorology, e.g., for modeling the air quality [180,181], and in epidemiology [182,183].

4. Limitations and Pitfalls of Connectivity Measures

The estimation of symmetric and causal relationships from observational data has been vastly explored, along with the limitations and pitfalls of the corresponding measures [35,38,184,185,186,187,188,189,190]. Naturally, depending on the examined application, different additional issues may arise that should be addressed. For example, when analyzing electroencephalogram data, spurious functional connectivity may arise due to the common reference problem, i.e., resulting from the use of a common reference channel. Therefore, connectivity measures that are sensitive to zero-lag correlations may give erroneous indications depending on the relative strength of the potential fluctuations at the recording and reference locations.
Linear model-based connectivity measures assume linearity of the relationships, and outliers can strongly affect them [191,192]. In some cases, relationships can be linearised by transforming the variables, e.g., by considering a logarithmic transformation. Alternatively, for monotonic nonlinear relations, rank-based measures can be utilized. If these solutions cannot be applied, then nonparametric and nonlinear measures are more appropriate; for example, rank-based measures and information-based measures are robust to outliers.
In general, measures of connectivity are biased, and, therefore, under the null hypothesis of no connectivity the estimates will be different from zero. Accurate estimation of connectivity measures requires sufficient sample sizes [193]. Guidelines for sufficient sample sizes have been presented for different scenarios [194,195,196], whereas solutions for different applications with small samples have been proposed [197,198,199].
Real data may entail various types of noise and noise levels, and different data measurement methods may entail measurement errors depending on the application. For example, noise in financial data may stem from small price movements and trading noises that exhibit heavy tails. The effect of noise on correlation measures has been examined in different studies [200,201,202]. The effect of noise on Granger causality analysis has also been examined: due to noise, erroneous causality arises and true causality is suppressed when using the standard linear Granger causality test [36]. Nonlinear causality measures are generally more robust to the effect of noise than linear ones [26,125].
There are various reasons for inferring spurious causal effects, such as unobserved variables, contemporaneous relationships, common inputs, synergetic and redundant influences and strong autocorrelations in the sample. Another difficulty in causal inference is the discrimination between direct and indirect interactions when common inputs exist, although direct causality measures have been developed for this purpose. Robust methods that can account for latent effects of unobserved variables remain an open area of investigation in connectivity analysis [203,204,205,206,207,208].
The determination of causal directionality for contemporaneous links is an emerging area. An instantaneous causal effect can be interpreted as a zero-lag causality or as a symmetric causal relationship. However, it has been noted that instantaneous causality may arise in case of common sources and latent, unobserved variables [41,171].

5. Correlation vs. Causality

A plethora of connectivity measures have been briefly discussed above, along with some main pitfalls and limitations. However, a key question that arises is whether to apply symmetric correlation measures or directional causality measures to infer the connectivity network of an examined system. Since connectivity of real systems is unknown, the nature of the examined data and the performance of the connectivity measures are of great importance.
Therefore, we generate synthetic time series with known connectivity structures and demonstrate the efficacy of the connectivity measures in three different scenarios. In particular, we examine the influence of different types of dependencies in the samples on the performance of the connectivity measures. First, we consider a simulation system with only contemporaneous dependencies and explore the performance of the connectivity measures, and in particular of the causality measures. The second simulation system demonstrates the effect of time-lagged directional relationships on the connectivity measures, and in particular on correlation measures that are not designed to detect lagged dependencies. Finally, we consider a system with both contemporaneous and time-lagged directional relationships and examine the performance of the connectivity measures. To better simulate real data, which are usually non-normal, the noise terms of the considered stochastic simulation systems are not exclusively Gaussian, as usually assumed in the literature; skewed, non-symmetric noise terms are also considered.
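A minimal sketch of how such a system can be generated is given below. The coefficients, the specific coupling structure (a lagged link X → Y plus a contemporaneous link between Z and Y) and the centred exponential noise are illustrative assumptions, not the article's exact simulation systems:

```python
# Sketch: a three-variate system with a lagged causal link (X -> Y),
# a contemporaneous link (Z enters Y at lag zero) and skewed noise.
import numpy as np

def simulate(n=2000, seed=0):
    rng = np.random.default_rng(seed)
    # Centred exponential: skewed, zero-mean, unit-variance noise.
    skew = lambda: rng.exponential(1.0) - 1.0
    x = np.zeros(n); y = np.zeros(n); z = np.zeros(n)
    for t in range(1, n):
        z[t] = 0.5 * z[t - 1] + skew()
        x[t] = 0.4 * x[t - 1] + skew()
        y[t] = 0.3 * y[t - 1] + 0.6 * x[t - 1] + 0.5 * z[t] + skew()
    return x, y, z

x, y, z = simulate()
print(np.corrcoef(y[1:], x[:-1])[0, 1])  # lagged dependence of Y on X
print(np.corrcoef(y, z)[0, 1])           # contemporaneous dependence
```

Generating 100 independent realizations then amounts to calling the function with different seeds, mirroring the Monte Carlo design described above.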
Based on the equations of each simulation system, 100 realizations with sample size n = 2000 are formed and the different connectivity measures are computed. Specifically, we estimate four correlation measures, four causality measures and two instantaneous causality measures. Let us examine the three-variate case, where the observed variables are X, Y and Z. The multivariate connectivity measures are similarly defined; however, instead of Z, an ensemble of conditioning variables $\mathbf{Z} = \{Z_1, \ldots, Z_K\}$ exists.
The aim of the study is to provide insights on the effectiveness of the different types of connectivity measures. Therefore, an indicative selection of measures is performed since it is impossible to include the ensemble of existing connectivity measures. The examined measures cover the most commonly used types of connectivity measures.
The considered correlation measures are the following ones:
  • Partial linear Pearson correlation coefficient (PPCor): $\mathrm{PPCor} = \frac{\rho_{XY} - \rho_{XZ}\rho_{ZY}}{\sqrt{1-\rho_{XZ}^2}\sqrt{1-\rho_{ZY}^2}}$, where $\rho_{XY} = \frac{\mathrm{cov}(X,Y)}{\sigma_X \sigma_Y}$, $\mathrm{cov}$ stands for the covariance, and $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y. Estimation of PPCor is performed with the “partialcorr” function from the Matlab Statistics Toolbox.
  • Partial Spearman rank correlation coefficient (PSpCorr), defined similarly to PPCor but on the series of the ranks. Estimation of PSpCorr is performed based on “partialcorr” function from the Matlab Statistics Toolbox.
  • Partial distance correlation (pdCor) is the extension of the distance correlation (dCor) to the multivariate case. The distance correlation of two random variables is obtained from their distance covariance and distance variances, i.e., $\mathrm{dCor}^2(X,Y) = \frac{\mathrm{dCov}^2(X,Y)}{\sqrt{\mathrm{dVar}(X)\,\mathrm{dVar}(Y)}}$. Partial distance correlation is defined based on a Hilbert space where the squared distance covariance is defined as an inner product [51]. Estimation of pdCor is performed based on the R codes given in [209].
  • Mutual information (MI) can be expressed in entropy terms as $\mathrm{MI} = H(X) - H(X|Y)$, where $H(X)$ is the Shannon entropy of the variable X. The k-nearest neighbors (KNN) estimator has been utilized for the estimation of MI [210].
The considered causality measures are listed below:
  • Conditional Granger causality index (CGCI) is defined on the unrestricted and restricted vector autoregressive (VAR) models of order P, fitted to the time series of X: $\mathrm{CGCI}_{Y \to X|Z} = \ln(s_R^2 / s_U^2)$, where the unrestricted model includes past terms of the X, Y, Z variables, the restricted model omits the past terms of the X variable, and $s_R^2$, $s_U^2$ are the residual variances of the corresponding VAR models. The Matlab code for the computation of CGCI can be found at https://github.com/dkugiu/Matlab/ (accessed on 23 October 2021).
  • Restricted conditional Granger causality index (RCGCI) is defined similarly to CGCI, however a modified backward-in-time selection method is used and a subset of lagged terms enter the unrestricted VAR model. Matlab codes for the computation of RCGCI can be found in https://users.auth.gr/dkugiu/ (accessed on 23 October 2021).
  • Partial transfer entropy on non-uniform embedding (PTENUE) measures the direct effect of Y on X in the presence of the “appropriate” past terms of all the variables w_t = {w_t^X, w_t^Y, w_t^Z}: PTENUE_{Y→X|Z} = I(x_{t+1}; w_t^Y | w_t^X, w_t^Z), where x_{t+1} is the future value of X one step ahead. Matlab codes for the estimation of PTENUE can be found in http://www.lucafaes.net/its.html (accessed on 23 October 2021).
  • Partial directed coherence (PDC) is based on VAR models, like CGCI; however, it is defined in the frequency domain. For a frequency f, it is given as PDC_{Y→X|Z}(f) = |A_{1,2}(f)| / √(Σ_k |A_{k,2}(f)|²), where A(f) is the Fourier transform of the coefficients of the VAR model of order P and A_{i,j}(f) is the component at position (i, j) of the A(f) matrix. Matlab code can be provided upon request.
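The Granger-causality logic behind CGCI can be sketched in plain Python: fit the restricted and unrestricted autoregressions by ordinary least squares and take the log ratio of residual variances. The sketch below is a simplified bivariate illustration (order-1 models, no conditioning set Z, no significance test), not the toolbox implementation linked above.

```python
import math
import random

def solve(A, b):
    # Gaussian elimination with partial pivoting for the small normal equations
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    out = [0.0] * n
    for r in range(n - 1, -1, -1):
        out[r] = (M[r][n] - sum(M[r][k] * out[k] for k in range(r + 1, n))) / M[r][r]
    return out

def resid_var(X, y):
    # OLS residual variance: beta = (X'X)^{-1} X'y
    p = len(X[0])
    XtX = [[sum(r[a] * r[b] for r in X) for b in range(p)] for a in range(p)]
    Xty = [sum(r[a] * yi for r, yi in zip(X, y)) for a in range(p)]
    beta = solve(XtX, Xty)
    res = [yi - sum(b * xa for b, xa in zip(beta, r)) for r, yi in zip(X, y)]
    return sum(e * e for e in res) / len(res)

def gci(target, driver, order=1):
    # ln(s_R^2 / s_U^2): the restricted model uses only the target's past,
    # the unrestricted model adds the driver's past
    y, Xr, Xu = [], [], []
    for t in range(order, len(target)):
        lt = [target[t - l] for l in range(1, order + 1)]
        ld = [driver[t - l] for l in range(1, order + 1)]
        y.append(target[t])
        Xr.append(lt + [1.0])
        Xu.append(lt + ld + [1.0])
    return math.log(resid_var(Xr, y) / resid_var(Xu, y))

# toy system with a known unidirectional driving X -> Y at lag 1
rng = random.Random(1)
x, y = [0.0], [0.0]
for _ in range(600):
    x.append(0.7 * x[-1] + rng.gauss(0, 1))
    y.append(0.5 * x[-2] + 0.3 * y[-1] + rng.gauss(0, 1))
x, y = x[100:], y[100:]
g_xy = gci(y, x)   # clearly positive: x's past improves the prediction of y
g_yx = gci(x, y)   # near zero: y's past adds nothing for x
```

The index is positive in the true driving direction and close to zero in the other, which is the pattern the significance tests in the paper formalize.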
Finally, two instantaneous causality measures are considered in this study:
  • Partial mutual information on mixed embedding (PMIME0) is an extension of the causality measure PMIME that also includes zero-lag terms. For the estimations, the Matlab code was provided by the authors of [171].
  • Peter–Clark momentary conditional independence algorithm (PCMCI+) addresses both lagged and contemporaneous causal discovery. It is an extension of PCMCI, which searches for causal parents based on conditional independence tests. The information-theoretic framework is considered here, where the conditional mutual information is utilized as a general test statistic. Computations are performed using the Python codes in https://github.com/jakobrunge/tigramite (accessed on 23 October 2021).
Standard free parameters and significance tests are utilized for each connectivity measure. In particular, the significance of PPCor and PSpCorr is assessed parametrically based on the test statistic t = r√(n − 2)/√(1 − r²), which follows the Student’s t-distribution with n − 2 degrees of freedom. For the estimation of MI, we consider the KNN method with k = 10 neighbors. Its statistical significance is assessed by randomly permuting the time series; p-values are then estimated from a one-sided test for the null hypothesis that the two variables are independent. The number of permuted time series is set equal to 100. The significance of pdCor is also assessed using 100 permutations. As previously stated, the PDC is estimated for a range of frequencies in [0, 0.5] (256 different frequencies). Significance is assessed parametrically, since PDC is defined on VARs. The percentage of significant PDC values for each frequency is then examined. Finally, we display the percentage of significant PDC values over all frequencies and realizations, instead of displaying results for specific frequencies or frequency bands. The order of the VAR model is set to P = 1 for system 1 and to P = 3 for systems 2 and 3.
A parametric significance test is also employed for CGCI and RCGCI, since these measures are defined on VARs [153]. The order of the VAR is set as noted for PDC above. PTENUE and PMIME0 incorporate surrogates within their estimation algorithms, so no separate significance test is required; positive values suggest the existence of causal effects, otherwise zero values are obtained. The free parameter L_max for the lagged terms is equal to 4 for all systems. The significance level for the termination criterion is 0.05 for PMIME0, whereas for PTENUE it is set to 0.01. For both measures, the mixed embedding vector is built to explain one step ahead, 100 surrogates are considered for the significance test and 10 neighbors are used (KNN estimator). Finally, for PCMCI+, the majority rule for handling ambiguous triples is adopted, the significance level is 0.05 and local permutation tests are employed within the estimation procedure to determine the causal parents.
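The permutation scheme described above (100 random shuffles, one-sided p-value) can be sketched generically. The statistic below is the absolute Pearson correlation, but any dependence measure such as the KNN estimate of MI could be plugged in; the add-one correction in the p-value is a common convention and an assumption of this sketch, not necessarily the authors’ exact choice.

```python
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def perm_pvalue(x, y, stat=pearson, n_perm=100, seed=0):
    # one-sided permutation test: how often does shuffling y produce a
    # dependence statistic at least as large as the observed one?
    rng = random.Random(seed)
    obs = abs(stat(x, y))
    yy = list(y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(yy)          # destroys any dependence between x and y
        if abs(stat(x, yy)) >= obs:
            count += 1
    return (count + 1) / (n_perm + 1)
```

For strongly dependent series the p-value is at its minimum of 1/(n_perm + 1), whereas for independent series it is spread over (0, 1], so the null hypothesis of independence is not rejected.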

5.1. Simulation System 1

First, we consider a five-variate stochastic nonlinear simulation system where, by construction, the data have known contemporaneous dependencies and there are no causal influences. The equations of the system are the following:
x_{1t} = e_{1t}
x_{2t} = e_{2t}
x_{3t} = 0.8 x_{2t} + e_{3t}
x_{4t} = 0.7 x_{1t}(x_{1t}² - 1) e^{-x_{1t}²/2} + e_{4t}
x_{5t} = 0.3 x_{2t} + 0.05 x_{2t}² + e_{5t}
where e_{1t} follows an exponential distribution with rate λ = 2, e_{2t} follows a chi-squared distribution with 1 degree of freedom, e_{3t}, e_{4t}, e_{5t} follow the Gaussian distribution (zero mean, unit standard deviation) and all noise processes are independent of each other. Based on the system’s equations, significant positive linear dependencies exist between the variables X2, X3, whereas nonlinear ones exist between the variables X1, X4 and X2, X5 (Figure 1a).
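The system is straightforward to simulate. The sketch below (the helper names are mine, not from the paper) draws independent noise per the stated distributions; a quick check of the pairwise correlations reproduces the intended structure, e.g., a strong X2–X3 association and no X1–X2 association.

```python
import math
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def simulate_system1(n, seed=0):
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        e1 = rng.expovariate(2.0)        # exponential, rate lambda = 2
        e2 = rng.gauss(0, 1) ** 2        # chi-squared with 1 degree of freedom
        e3, e4, e5 = (rng.gauss(0, 1) for _ in range(3))
        x1 = e1
        x2 = e2
        x3 = 0.8 * x2 + e3
        x4 = 0.7 * x1 * (x1 ** 2 - 1) * math.exp(-x1 ** 2 / 2) + e4
        x5 = 0.3 * x2 + 0.05 * x2 ** 2 + e5
        rows.append((x1, x2, x3, x4, x5))
    return list(zip(*rows))              # five series, one per variable
```

Note that only zero-lag terms enter the equations, so any lagged dependence a measure reports on these data is spurious.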
Correlation measures are symmetric; therefore, results are displayed in upper-triangular tables. PPCor detects the linear correlation between X2, X3 and the nonlinear one between X2 and X5; however, the complex nonlinear relationship of X1 and X4 is detected with a very low percentage over the 100 realizations (Table 3). PSpCorr correctly identifies the connectivity network of the system; however, it also indicates the indirect association of X3 and X5 with a percentage of 100% over the 100 realizations. The pdCor and MI have similar performance to PSpCorr, with MI achieving a relatively low percentage of significant correlations for the pair of variables X1–X4 (31%). Since MI is the only bivariate correlation measure considered, it is the only measure expected to indicate the association of X1 and X4. Therefore, although the system is formulated with only contemporaneous dependencies, none of the correlation measures describes the entire connectivity network of the system with complete accuracy.
On the other hand, all the direct causality measures correctly suggest that no causal effects exist among the variables of the first simulation system. RCGCI, PTENUE, PDC and PCMCI achieve low percentages of significant causal links over the 100 realizations (around the nominal level of 5%), therefore suggesting that no causal effects exist. Regarding PDC, it gives low percentages of significant effects at all the examined frequencies. No information about contemporaneous relations can be inferred from the causality measures.
Finally, the instantaneous causality measures PMIME0 and PCMCI+ are estimated. Both contemporaneous and lagged effects are extracted and reported in Table 3 for both measures. PMIME0 correctly finds the contemporaneous relationships; however, the percentage of significant relations over the 100 realizations for X1–X4 is relatively low (42%). Due to the estimation procedure of PMIME0, results are approximately symmetric as far as contemporaneous effects are concerned. Regarding the lagged effects based on PMIME0, percentages of significant links are greater than the nominal level (5%) in most directions, with the highest one reaching 28% for X1 → X3. Such behavior has been observed previously for PMIME, where non-coupled pairs of variables were detected with percentages larger than the considered nominal level [26]; as previously mentioned, PMIME0 is the extension of the causality measure PMIME that infers both lagged and contemporaneous dependencies. Finally, PCMCI+ correctly identifies the contemporaneous effects with high percentages over the 100 realizations. Further, low percentages for causality are obtained for all pairs of variables. The majority of the estimated percentages are slightly over the nominal level.

5.2. Simulation System 2

In the second example, the causal influences between the variables are known by construction, while no contemporaneous influences exist. A nonlinear vector autoregressive (VAR) model of order 3 in five variables is formed. The system’s equations are given below:
x_{1t} = 0.7 x_{1,t-1} + e_{1t}
x_{2t} = 0.3 x_{1,t-2}² + e_{2t}
x_{3t} = 0.4 x_{1,t-3} - 0.3 x_{3,t-2} + e_{3t}
x_{4t} = 0.7 x_{4,t-1} - 0.3 x_{5,t-1} e^{-x_{5,t-1}²/2} + e_{4t}
x_{5t} = 0.5 x_{4,t-1} + 0.2 x_{5,t-2} + e_{5t}
where e_{1t}, e_{5t} follow the Gaussian distribution (zero mean, unit standard deviation), e_{2t} follows an exponential distribution with rate λ = 2, e_{3t} follows a beta distribution with shape parameters a = 1 and b = 2, e_{4t} follows a beta distribution with a = 2 and b = 1, and all noise processes are independent of each other. Based on the system’s equations, there are both nonlinear causal influences, i.e., X1 → X2, X5 → X4, and linear ones, i.e., X1 → X3, X4 → X5 (Figure 1b).
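As for system 1, the equations translate directly into a simulation loop; the sketch below (helper names are mine) uses a burn-in period to discard the transient. A simple check confirms the construction: the lagged driving X1 → X3 shows up as a strong lagged cross-correlation, while the disconnected blocks {X1, X2, X3} and {X4, X5} are uncorrelated.

```python
import math
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def simulate_system2(n, seed=1, burn=200):
    rng = random.Random(seed)
    N = n + burn
    x1, x2, x3, x4, x5 = ([0.0] * N for _ in range(5))
    for t in range(3, N):
        x1[t] = 0.7 * x1[t - 1] + rng.gauss(0, 1)
        x2[t] = 0.3 * x1[t - 2] ** 2 + rng.expovariate(2.0)
        x3[t] = 0.4 * x1[t - 3] - 0.3 * x3[t - 2] + rng.betavariate(1, 2)
        x4[t] = (0.7 * x4[t - 1]
                 - 0.3 * x5[t - 1] * math.exp(-x5[t - 1] ** 2 / 2)
                 + rng.betavariate(2, 1))
        x5[t] = 0.5 * x4[t - 1] + 0.2 * x5[t - 2] + rng.gauss(0, 1)
    return [s[burn:] for s in (x1, x2, x3, x4, x5)]
```

This lagged-but-strong cross-correlation is precisely what misleads the zero-lag correlation measures in Table 4.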
In the second simulation system, the correlation measures seem to be affected by the temporal dependencies and indicate significant contemporaneous dependencies (Table 4). In particular, all the correlation measures indicate the correlated pairs of variables X1–X3 and X4–X5, whereas pdCor also suggests additional correlated pairs of variables (X1–X2, X1–X4 and X2–X3). We notice that the suggested correlated pairs of variables (X1–X3 and X4–X5) coincide with the pairs of variables with linear causal links, i.e., X1 linearly causes X3 and X4 linearly causes X5.
The linear causality measures CGCI and RCGCI infer the causal relationships correctly; however, the nonlinear link X1 → X2 achieves a relatively low percentage of significant effects over the 100 realizations (20% and 24%, respectively). The separability assumption states that there is unique information about the target variable contained in the driving variable. When this assumption is satisfied, as in the case of linear stochastic systems, Granger causality is effective, whereas deterministic dynamical systems commonly do not satisfy the separability condition. Therefore, the ability of CGCI and RCGCI to detect nonlinear causal effects is related to the nature of the examined system and the satisfaction of the separability assumption.
The nonlinear causality measure PTENUE also correctly indicates the directional linkages; however, the effect X4 → X5 is detected with a percentage of only 44% over the 100 realizations. PDC has the lowest performance among the causality measures. It detects X1 → X3 and X4 → X5 but fails to find X1 → X2, whereas the spurious causal effects X4 → X1 (37%), X4 → X2 (52%) and X4 → X3 (35%) are obtained.
Regarding the instantaneous causality measures, PMIME0 does not identify contemporaneous relations and suggests the correct causal effects. As noted in the first simulation system, the percentage of significant causal effects for the non-causal links may exceed the nominal level; however, the reported percentages are generally lower compared with those obtained for system 1. PCMCI+ performs worse than PMIME0. It correctly suggests only temporal dependencies; however, large percentages of significant causal effects are noted for non-causal links, with the highest erroneous percentages concerning X2 → X3 (41%) and X3 → X1 (32%); the common input variable X1 possibly confuses PCMCI+.

5.3. Simulation System 3

Finally, we consider a system with both temporal and contemporaneous dependencies, i.e., lagged and zero-lag dependencies. The causal influences between the variables are known by construction. The equations of the considered system are the following:
x_{1t} = 0.6 x_{1,t-2} + e_{1t}
x_{2t} = x_{1t} + 0.3 x_{2,t-1} + e_{2t}
x_{3t} = 0.3 x_{3,t-1} + sin(x_{2,t-3}) + e_{3t}
x_{4t} = 0.4 x_{3,t-2} + e_{4t}
x_{5t} = 3.2 + 0.5 x_{3,t-1}² + e_{5t}
where e_{1t}, e_{4t} follow the Gaussian distribution (zero mean, unit standard deviation), e_{2t}, e_{3t} follow a beta distribution with shape parameters a = 1 and b = 2, e_{5t} follows a gamma distribution with a = 16 (shape) and b = 0.25 (rate), and all noise processes are independent of each other. Based on the system’s equations, there is a contemporaneous relationship between X1 and X2, the linear causal influence X3 → X4 and the nonlinear causal effects X2 → X3, X3 → X5 (Figure 1c).
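This mixed system can be simulated in the same way (helper names are mine); note that Python’s gammavariate takes a scale, so the stated rate b = 0.25 corresponds to scale 1/b = 4 — an interpretation of the text, flagged in the comment. The checks confirm that both kinds of dependence coexist: a contemporaneous X1–X2 link and a lagged X3 → X4 link.

```python
import math
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def simulate_system3(n, seed=2, burn=200):
    rng = random.Random(seed)
    N = n + burn
    x1, x2, x3, x4, x5 = ([0.0] * N for _ in range(5))
    for t in range(3, N):
        x1[t] = 0.6 * x1[t - 2] + rng.gauss(0, 1)
        x2[t] = x1[t] + 0.3 * x2[t - 1] + rng.betavariate(1, 2)
        x3[t] = 0.3 * x3[t - 1] + math.sin(x2[t - 3]) + rng.betavariate(1, 2)
        x4[t] = 0.4 * x3[t - 2] + rng.gauss(0, 1)
        # gamma noise: shape a = 16; the stated rate b = 0.25 is taken here
        # as scale 1/b = 4 for Python's gammavariate(shape, scale)
        x5[t] = 3.2 + 0.5 * x3[t - 1] ** 2 + rng.gammavariate(16, 4)
    return [s[burn:] for s in (x1, x2, x3, x4, x5)]
```

Data of this kind are exactly the case where only the instantaneous causality measures can, in principle, recover the full picture.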
The correlation measures correctly indicate the contemporaneous relation of X1 and X2; however, additional relations are suggested for the pairs of variables with causal relations, but also for many non-causal pairs of variables (Table 5). Therefore, as already noted in the second simulation example, lagged effects affect the performance of the correlation measures and erroneous contemporaneous relationships between the variables are indicated.
Regarding the causality measures, PTENUE suggests the correct causal effects but additionally indicates a causal effect from X1 to X2. Based on the system’s equations, by substituting the equation of x_{1t} into the equation of x_{2t}, a lagged effect of x_{1,t-2} on x_{2t} is obtained: x_{2t} = (0.6 x_{1,t-2} + e_{1t}) + 0.3 x_{2,t-1} + e_{2t}. Therefore, the link X1 → X2 is not erroneously found by the causality measures; a lagged relation emerges from the equations of the system. RCGCI has similar performance to PTENUE; however, it also detects the indirect link X1 → X3 with a low percentage (22%). PDC again has the worst performance, overestimating the coupled pairs of variables.
Such a designed system favors the instantaneous causality measures, since both contemporaneous and causal effects exist. PMIME0 correctly identifies the contemporaneous dependence of X1 and X2 and the causal links, although X2 → X1 is also suggested (97%). As previously noted, moderately high percentages of significant effects are also obtained for non-causal pairs of variables, reaching 26% for X4 → X5. PCMCI+ correctly infers the contemporaneous and causal dependencies; however, it seems to give high percentages of significant causal links in almost all directions.

6. Conclusions

In this paper, we have presented a brief review of the main connectivity measures currently used to infer the connectivity network of an examined complex system. Connectivity analysis is essential in different applications, such as finance and neurophysiology. Nondirectional measures indicate the symmetric relationships of variables that evolve in synchrony, whereas directional measures infer the directions of the causal influences. The main limitations of the connectivity measures have been briefly discussed.
When studying the interdependencies of a system, connectivity may be inferred based on correlations or causality. However, selecting the proper methodology is still an open issue, since the nature of real systems is in general unknown. This study investigated the efficacy of different connectivity measures on different simulated data, whereas complexity was further increased by considering non-normal noise terms in order to generate samples with skewed and/or non-symmetric distributions. Simulation experiments were used to demonstrate the performance of the connectivity measures in three different scenarios, i.e., when data exhibit only contemporaneous dependencies, only directional causal effects, and both contemporaneous and temporal dependencies.
The main outcomes of the simulation study can be summarized as follows:
(a) Results suggest the sensitivity of correlation measures when temporal dependencies exist in the data. Correlation measures tend to erroneously indicate contemporaneous relations even though only lagged dependencies exist.
(b) Causality measures do not spuriously indicate causal effects when data present only contemporaneous dependencies. We should note here that the poor performance of PDC for systems 2 and 3 may be due to the fact that significant PDC values are reported comprehensively over all the examined frequencies. In real applications, specific frequency bands are usually selected according to the type of samples [211,212].
(c) Instantaneous causality measures handle contemporaneous and causal effects at the same time. Therefore, they seem highly promising for analyzing the connectivity structure of real data.
Although both considered instantaneous causality measures seem to have potential and effectively infer the dependencies of most examined systems, they tend to give high percentages of significant causal effects for non-causal pairs of variables. This is a problem that clearly reduces the effectiveness of the measures. The consideration of different values for the free parameters of the measures, such as the significance level or the number of neighbors for PMIME0, may improve their performance; however, here, only standard values of the free parameters are used for all the examined systems and all causality measures. A possible optimization of the free parameters of the measures is out of the scope of this work. However, the necessity of an automatic selection of standard free parameters of any connectivity measure in real applications should be pointed out.
The indicative simulation study highlights the limitations and advantages of the different connectivity measures. The outcomes of this study suggest the superiority of the causality measures over both the correlation measures and the instantaneous causality measures. Correlation measures are highly affected by lagged directional relationships. Instantaneous causality measures, although promising, still need to be optimized in order to be applied effectively. At this point, we should note that correlation measures have long been utilized effectively and are suitable for specific data types with possibly known topological features or characteristics, e.g., [213,214]. Further, their computation is extremely fast in contrast to the time-consuming estimation of most nonlinear causality measures.
Future studies aim to further investigate the above findings by testing additional scenarios regarding the samples and the nature of the dependencies, i.e., by considering samples with longer memory, samples that exhibit volatility clustering and samples of higher dimensions.

Funding

This project has received funding from the Hellenic Foundation for Research and Innovation (HFRI) and the General Secretariat for Research and Innovation (GSRI), under grant agreement No. 794.

Data Availability Statement

Not applicable.

Conflicts of Interest

The author declares no conflict of interest. The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

  1. Granger, C. Investigating causal relations by econometric models and cross-spectral methods. Econom. J. Econom. Soc. 1969, 37, 424–438.
  2. Scher, M.; Sun, M.; Steppe, D.; Guthrie, R.; Sclabassi, R. Comparisons of EEG spectral and correlation measures between healthy term and preterm infants. Pediatr. Neurol. 1994, 10, 104–108.
  3. Kelly, E.; Lenz, J.; Franaszczuk, P.; Truong, Y. A general statistical framework for frequency-domain analysis of EEG topographic structure. Comput. Biomed. Res. 1997, 30, 129–164.
  4. Precup, O.; Iori, G. A comparison of high-frequency cross-correlation measures. Phys. A Stat. Mech. Its Appl. 2004, 344, 252–256.
  5. Bolboaca, S.D.; Jäntschi, L. Pearson versus Spearman, Kendall’s tau correlation analysis on structure-activity relationships of biologic active compounds. Leonardo J. Sci. 2006, 5, 179–200.
  6. Song, L.; Langfelder, P.; Horvath, S. Comparison of co-expression measures: Mutual information, correlation, and model based indices. BMC Bioinform. 2012, 13, 328.
  7. Cutts, C.; Eglen, S. Detecting pairwise correlations in spike trains: An objective comparison of methods and application to the study of retinal waves. J. Neurosci. 2014, 34, 14288–14303.
  8. Gencaga, D.; Malakar, N.; Lary, D. Survey on the estimation of mutual information methods as a measure of dependency versus correlation analysis. AIP Conf. Proc. 2014, 1636, 80–87.
  9. Wang, Y.; Li, Y.; Cao, H.; Xiong, M.; Shugart, Y.Y.; Jin, L. Efficient test for nonlinear dependence of two continuous variables. BMC Bioinform. 2015, 16, 260.
  10. Skotarczak, E.; Dobek, A.; Moliński, K. Comparison of some correlation measures for continuous and categorical data. Biom. Lett. 2019, 56, 253–261.
  11. Ombao, H.; Pinto, M. Spectral dependence. arXiv 2021, arXiv:2103.17240.
  12. Mormann, F.; Lehnertz, K.; David, P.; Elger, C. Mean phase coherence as a measure for phase synchronization and its application to the EEG of epilepsy patients. Phys. D Nonlinear Phenom. 2000, 144, 358–369.
  13. Quiroga, R.; Kreuz, T.; Grassberger, P. Event synchronization: A simple and fast method to measure synchronicity and time delay patterns. Phys. Rev. E 2002, 66, 041904.
  14. Guevara, R.; Velazquez, J.; Nenadovic, V.; Wennberg, R.; Senjanović, G.; Dominguez, L. Phase synchronization measurements using electroencephalographic recordings. Neuroinformatics 2005, 3, 301–313.
  15. Liang, Z.; Bai, Y.; Ren, Y.; Li, X. Synchronization measures in EEG signals. In Signal Processing in Neuroscience; Springer: Singapore, 2016; pp. 167–202.
  16. Yoshinaga, K.; Matsuhashi, M.; Mima, T.; Fukuyama, H.; Takahashi, R.; Hanakawa, T.; Ikeda, A. Comparison of phase synchronization measures for identifying stimulus-induced functional connectivity in human magnetoencephalographic and simulated data. Front. Neurosci. 2020, 14, 648.
  17. Honari, H.; Choe, A.; Lindquist, M. Evaluating phase synchronization methods in fMRI: A comparison study and new approaches. NeuroImage 2021, 228, 117704.
  18. Ostermark, R.; Aaltonen, J. Comparison of univariate and multivariate Granger causality in international asset pricing. Evidence from Finnish and Japanese financial economies. Appl. Financ. Econ. 1999, 9, 155–165.
  19. Nolte, G.; Ziehe, A.; Krämer, N.; Popescu, F.; Müller, K.R. Comparison of Granger causality and phase slope index. Causality Object. Assess. 2010, 6, 267–276.
  20. Florin, E.; Gross, J.; Pfeifer, J.; Fink, G.; Timmermann, L. Reliability of multivariate causality measures for neural data. J. Neurosci. Methods 2011, 198, 344–358.
  21. Wu, M.H.; Frye, R.; Zouridakis, G. A comparison of multivariate causality based measures of effective connectivity. Comput. Biol. Med. 2011, 41, 1132–1141.
  22. Fasoula, A.; Attal, Y.; Schwartz, D. Comparative performance evaluation of data-driven causality measures applied to brain networks. J. Neurosci. Methods 2013, 215, 170–189.
  23. Zaremba, A.; Aste, T. Measures of causality in complex datasets with application to financial data. Entropy 2014, 16, 2309–2349.
  24. Siggiridou, E.; Kimiskidis, V.; Kugiumtzis, D. Dimension reduction of frequency-based direct Granger causality measures on short time series. J. Neurosci. Methods 2017, 289, 64–74.
  25. Siggiridou, E.; Koutlis, C.; Tsimpiris, A.; Kugiumtzis, D. Evaluation of Granger causality measures for constructing networks from multivariate time series. Entropy 2019, 21, 1080.
  26. Papana, A.; Siggiridou, E.; Kugiumtzis, D. Detecting direct causality in multivariate time series: A comparative study. Commun. Nonlinear Sci. Numer. Simul. 2021, 99, 105797.
  27. Cartwright, P.; Kamerschen, D.; Huang, M.Y. Price correlation and Granger causality tests for market definition. Rev. Ind. Organ. 1989, 4, 79–98.
  28. Beck, T.; Levine, R. Stock Markets, Banks, and Growth: Correlation or Causality? World Bank Publications: Washington, DC, USA, 2001; Volume 2670.
  29. Billio, M.; Getmansky, M.; Lo, A.; Pelizzon, L. Econometric measures of connectedness and systemic risk in the finance and insurance sectors. J. Financ. Econ. 2012, 104, 535–559.
  30. Ateş, E.; Güran, A. Pearson correlation and Granger causality analysis of Twitter sentiments and the daily changes in Bist30 index returns. J. Fac. Eng. Archit. Gazi Univ. 2021, 36, 1687–1701.
  31. Kozak, M. What is strong correlation? Teach. Stat. 2009, 31, 85–86.
  32. Aggarwal, R.; Ranganathan, P. Common pitfalls in statistical analysis: The use of correlation techniques. Perspect. Clin. Res. 2016, 7, 187.
  33. Armstrong, R. Should Pearson’s correlation coefficient be avoided? Ophthalmic Physiol. Opt. 2019, 39, 316–327.
  34. Saccenti, E.; Hendriks, M.; Smilde, A. Corruption of the Pearson correlation coefficient by measurement error and its estimation, bias, and correction under different error models. Sci. Rep. 2020, 10, 438.
  35. Janse, R.; Hoekstra, T.; Jager, K.; Zoccali, C.; Tripepi, G.; Dekker, F.; van Diepen, M. Conducting correlation analysis: Important limitations and pitfalls. Clin. Kidney J. 2021, 14, 2332–2337.
  36. Nalatore, H.; Ding, M.; Rangarajan, G. Mitigating the effects of measurement noise on Granger causality. Phys. Rev. E 2007, 75, 031123.
  37. Ramb, R.; Eichler, M.; Ing, A.; Thiel, M.; Weiller, C.; Grebogi, C.; Schwarzbauer, C.; Timmer, J.; Schelter, B. The impact of latent confounders in directed network analysis in neuroscience. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2013, 371, 20110612.
  38. Bastos, A.; Schoffelen, J.M. A tutorial review of functional connectivity analysis methods and their interpretational pitfalls. Front. Syst. Neurosci. 2016, 9, 175.
  39. Trongnetrpunya, A.; Nandi, B.; Kang, D.; Kocsis, B.; Schroeder, C.; Ding, M. Assessing Granger causality in electrophysiological data: Removing the adverse effects of common signals via bipolar derivations. Front. Syst. Neurosci. 2016, 9, 189.
  40. Antonacci, Y.; Astolfi, L.; Faes, L. Testing different methodologies for Granger causality estimation: A simulation study. In Proceedings of the 28th European Signal Processing Conference (EUSIPCO), Amsterdam, The Netherlands, 24–28 August 2021; pp. 940–944.
  41. Koutlis, C.; Kugiumtzis, D. The effect of a hidden source on the estimation of connectivity networks from multivariate time series. Entropy 2021, 23, 208.
  42. Moraffah, R.; Sheth, P.; Karami, M.; Bhattacharya, A.; Wang, Q.; Tahir, A.; Raglin, A.; Liu, H. Causal inference for time series analysis: Problems, methods and evaluation. arXiv 2021, arXiv:2102.05829.
  43. Pearson, K. VII. Note on regression and inheritance in the case of two parents. Proc. R. Soc. Lond. 1895, 58, 240–242.
  44. Spearman, C. The proof and measurement of association between two things. Am. J. Psychol. 1904, 15, 72–101.
  45. Kendall, M. Rank Correlation Methods, 2nd ed.; Hafner Publishing Co.: New York, NY, USA, 1955.
  46. Hoeffding, W. A non-parametric test of independence. Ann. Math. Stat. 1948, 19, 546–557.
  47. Gaißer, S.; Ruppert, M.; Schmid, F. A multivariate version of Hoeffding’s phi-square. J. Multivar. Anal. 2010, 101, 2571–2586.
  48. Rao, C. Linear Statistical Inference and Its Applications; Wiley: New York, NY, USA, 1973; Volume 2.
  49. Székely, G.; Rizzo, M.; Bakirov, N. Measuring and testing dependence by correlation of distances. Ann. Stat. 2007, 35, 2769–2794.
  50. Székely, G.; Rizzo, M. Brownian distance covariance. Ann. Appl. Stat. 2009, 3, 1236–1265.
  51. Székely, G.; Rizzo, M. Partial distance correlation with methods for dissimilarities. Ann. Stat. 2014, 42, 2382–2412.
  52. Yule, G. On the association of attributes in statistics: With illustrations from the material of the childhood society &c. Philos. Trans. R. Soc. Lond. Ser. A 1900, 194, 257–319.
  53. Yule, G. On the methods of measuring association between two attributes. J. R. Stat. Soc. 1912, 75, 579–652.
  54. Digby, P. Approximating the tetrachoric correlation coefficient. Biometrics 1983, 39, 753–757.
  55. Bonett, D.; Price, R. Statistical inference for generalized Yule coefficients in 2 × 2 contingency tables. Sociol. Methods Res. 2007, 35, 429–446.
  56. Lopez-Paz, D.; Hennig, P.; Schölkopf, B. The randomized dependence coefficient. Adv. Neural Inf. Process. Syst. 2013, 26, 1–9.
  57. Ding, A.; Li, Y. Copula correlation: An equitable dependence measure and extension of Pearson’s correlation. arXiv 2013, arXiv:1312.7214.
  58. Wen, F.; Liu, Z. A copula-based correlation measure and its application in Chinese stock market. Int. J. Inf. Technol. Decis. Mak. 2009, 8, 787–801.
  59. Schmid, F.; Schmidt, R.; Blumentritt, T.; Gaißer, S.; Ruppert, M. Copula-based measures of multivariate association. In Copula Theory and Its Applications; Springer: Berlin/Heidelberg, Germany, 2010; pp. 209–236.
  60. Kim, J.M.; Jung, Y.S.; Choi, T.; Sungur, E. Partial correlation with copula modeling. Comput. Stat. Data Anal. 2011, 55, 1357–1366.
  61. Póczos, B.; Ghahramani, Z.; Schneider, J. Copula-based kernel dependency measures. arXiv 2012, arXiv:1206.4682.
  62. García-Gómez, C.; Pérez, A.; Prieto-Alaiz, M. Copula-based analysis of multivariate dependence patterns between dimensions of poverty in Europe. Rev. Income Wealth 2021, 67, 165–195.
  63. Shih, J.H.; Emura, T. On the copula correlation ratio and its generalization. J. Multivar. Anal. 2021, 182, 104708.
  64. Wang, Q.; Shen, Y.; Zhang, J. A nonlinear correlation measure for multivariable data set. Phys. D Nonlinear Phenom. 2005, 200, 287–295.
  65. Fraser, A.; Swinney, H. Independent coordinates for strange attractors from mutual information. Phys. Rev. A 1986, 33, 1134.
  66. Cover, T.; Thomas, J. Gambling and data compression. In Elements of Information Theory; John Wiley & Sons, Inc.: Hoboken, NJ, USA, 1991; pp. 125–140.
  67. Contreras-Reyes, J.E. Mutual information matrix based on asymmetric Shannon entropy for nonlinear interactions of time series. Nonlinear Dyn. 2021, 104, 3913–3924.
  68. Eshima, N.; Tabata, M. Entropy for measuring predictive power of generalized linear models. Stat. Probab. Lett. 2007, 77, 588–593.
  69. Eshima, N.; Tabata, M. Entropy coefficient of determination for generalized linear models. Comput. Stat. Data Anal. 2010, 54, 1381–1389.
  70. Reshef, D.; Reshef, Y.; Finucane, H.; Grossman, S.; McVean, G.; Turnbaugh, P.; Lander, E.; Mitzenmacher, M.; Sabeti, P. Detecting novel associations in large data sets. Science 2011, 334, 1518–1524.
  71. Qiuheng, T.; Jiang, H.; Yiming, D. Model selection method based on maximal information coefficient of residuals. Acta Math. Sci. 2014, 34, 579–592.
  72. Wilmer, A.; de Lussanet, M.; Lappe, M. Time-delayed mutual information of the phase as a measure of functional connectivity. PLoS ONE 2012, 7, e44633.
  73. Nunez, P. Electric Fields of the Brain: The Neurophysics of EEG; Oxford University Press: Oxford, OH, USA, 2006.
  74. Bendat, J.; Piersol, A. Random Data; Wiley-Interscience: Hoboken, NJ, USA, 1986.
  75. Rosenberg, J.; Halliday, D.; Breeze, P.; Conway, B. Identification of patterns of neuronal connectivity—Partial spectra, partial coherence, and neuronal interactions. J. Neurosci. Methods 1998, 83, 57–72.
  76. Makhtar, S.; Halliday, D.; Senik, M.; Mason, R. Multivariate partial coherence analysis for identification of neuronal connectivity from multiple electrode array recordings. In Proceedings of the IEEE Conference on Biomedical Engineering and Sciences (IECBES), Kuala Lumpur, Malaysia, 8–10 December 2014; pp. 77–82.
  77. Rosenblum, M.; Pikovsky, A.; Kurths, J. Phase synchronization of chaotic oscillators. Phys. Rev. Lett. 1996, 76, 1804.
  78. Lachaux, J.P.; Rodriguez, E.; Martinerie, J.; Varela, F. Measuring phase synchrony in brain signals. Hum. Brain Mapp. 1999, 8, 194–208.
79. Kuramoto, Y. Cooperative dynamics of oscillator community: A study based on lattice of rings. Prog. Theor. Phys. Suppl. 1984, 79, 223–240. [Google Scholar] [CrossRef]
  80. Eckmann, J. Recurrence plots of dynamical systems. Europhys. Lett. 1987, 5, 973–977. [Google Scholar] [CrossRef] [Green Version]
  81. Zbilut, J.; Giuliani, A.; Webber, C., Jr. Detecting deterministic signals in exceptionally noisy environments using cross-recurrence quantification. Phys. Lett. A 1998, 246, 122–128. [Google Scholar] [CrossRef]
  82. Marwan, N.; Kurths, J. Nonlinear analysis of bivariate data with cross recurrence plots. Phys. Lett. A 2002, 302, 299–307. [Google Scholar] [CrossRef] [Green Version]
  83. Zbilut, J.; Webber, C., Jr. Embeddings and delays as derived from quantification of recurrence plots. Phys. Lett. A 1992, 171, 199–203. [Google Scholar] [CrossRef]
  84. Webber, C., Jr.; Zbilut, J. Dynamical assessment of physiological systems and states using recurrence plot strategies. J. Appl. Physiol. 1994, 76, 965–973. [Google Scholar] [CrossRef]
  85. Dobrushin, R. General formulation of Shannon’s main theorem in information theory. Am. Math. Soc. Transl. 1963, 33, 323–438. [Google Scholar]
  86. Wyner, A. A definition of conditional mutual information for arbitrary ensembles. Inf. Control. 1978, 38, 51–59. [Google Scholar] [CrossRef] [Green Version]
  87. Cover, T.; Thomas, J. Elements of Information Theory, 2nd ed.; John Wiley & Sons: Hoboken, NJ, USA, 2012. [Google Scholar]
  88. Wilcox, R. Introduction to Robust Estimation and Hypothesis Testing, 3rd ed.; Academic Press: Cambridge, MA, USA, 2011. [Google Scholar]
  89. De La Fuente, A.; Bing, N.; Hoeschele, I.; Mendes, P. Discovery of meaningful associations in genomic data using partial correlation coefficients. Bioinformatics 2004, 20, 3565–3574. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  90. Fujita, A.; Sato, J.R.; Demasi, M.; Sogayar, M.; Ferreira, C.; Miyano, S. Comparing Pearson, Spearman and Hoeffding’s D measure for gene expression association analysis. J. Bioinform. Comput. Biol. 2009, 7, 663–684. [Google Scholar] [CrossRef]
  91. Kobayashi, H.; Ujike, H.; Hasegawa, J.; Yamamoto, M.; Kanzaki, A.; Sora, I. Correlation of tau gene polymorphism with age at onset of Parkinson’s disease. Neurosci. Lett. 2006, 405, 202–206. [Google Scholar] [CrossRef] [PubMed]
  92. Sleeman, R.; Van Wettum, A.; Trampert, J. Three-channel correlation analysis: A new technique to measure instrumental noise of digitizers and seismic sensors. Bull. Seismol. Soc. Am. 2006, 96, 258–271. [Google Scholar] [CrossRef]
  93. Benesty, J.; Chen, J.; Huang, Y. On the importance of the Pearson correlation coefficient in noise reduction. IEEE Trans. Audio Speech Lang. Process. 2008, 16, 757–765. [Google Scholar] [CrossRef]
  94. Huang, H.C.; Zheng, S.; Zhao, Z. Application of Pearson correlation coefficient (PCC) and Kolmogorov-Smirnov distance (KSD) metrics to identify disease-specific biomarker genes. BMC Bioinform. 2010, 11, P23. [Google Scholar] [CrossRef] [Green Version]
  95. Cahill, N. Normalized measures of mutual information with general definitions of entropy for multimodal image registration. In International Workshop on Biomedical Image Registration; Springer: Berlin/Heidelberg, Germany, 2010; pp. 258–268. [Google Scholar]
  96. Edirisinghe, C.; Zhou, W. Portfolio optimization using rank correlation. In Encyclopedia of Business Analytics and Optimization; IGI Global: Hershey, PA, USA, 2014; pp. 1866–1879. [Google Scholar]
  97. Haluszczynski, A.; Laut, I.; Modest, H.; Räth, C. Linear and nonlinear market correlations: Characterizing financial crises and portfolio optimization. Phys. Rev. E 2017, 96, 062315. [Google Scholar] [CrossRef] [Green Version]
  98. Tai-Yuen, H. Rank correlation analysis of investment decision for small investors in the Hong Kong derivatives markets. J. Econ. Bibliogr. 2015, 2, 106–116. [Google Scholar]
  99. Che, Y.; Jia, Y.; Tang, Z.; Lan, F. Application of Pearson correlation coefficient in wind power combination prediction. Guangxi Electr. Power 2016, 3, 50–53. [Google Scholar]
  100. Roverato, A.; Castelo, R. The networked partial correlation and its application to the analysis of genetic interactions. J. R. Stat. Soc. Ser. C Appl. Stat. 2017, 66, 647–665. [Google Scholar] [CrossRef] [Green Version]
  101. Jayaweera, C.; Aziz, N. Reliability of principal component analysis and Pearson correlation coefficient, for application in artificial neural network model development, for water treatment plants. IOP Conf. Ser. Mater. Sci. Eng. 2018, 458, 012076. [Google Scholar] [CrossRef]
  102. Podhorodecka, K. Tourism economies and islands’ resilience to the global financial crisis. Isl. Stud. J. 2018, 13, 163–184. [Google Scholar] [CrossRef]
  103. Farahmand, S.; Sobayo, T.; Mogul, D. Noise-assisted multivariate EMD-based mean-phase coherence analysis to evaluate phase-synchrony dynamics in epilepsy patients. IEEE Trans. Neural Syst. Rehabil. Eng. 2018, 26, 2270–2279. [Google Scholar] [CrossRef]
  104. Fiedor, P. Networks in financial markets based on the mutual information rate. Phys. Rev. E 2014, 89, 052801. [Google Scholar] [CrossRef] [Green Version]
  105. Millington, T.; Niranjan, M. Quantifying influence in financial markets via partial correlation network inference. In Proceedings of the 11th IEEE International Symposium on Image and Signal Processing and Analysis (ISPA), Dubrovnik, Croatia, 23–25 September 2019; pp. 306–311. [Google Scholar]
  106. Zhao, G.; Zhang, Y.; Zhang, G.; Zhang, D.; Liu, Y.J. Multi-target positive emotion recognition from EEG signals. IEEE Trans. Affect. Comput. 2020, 99, 1949–3045. [Google Scholar] [CrossRef]
  107. Kumar, G.; Kumar, R. A correlation study between meteorological parameters and COVID-19 pandemic in Mumbai, India. Diabetes Metab. Syndr. Clin. Res. Rev. 2020, 14, 1735–1742. [Google Scholar] [CrossRef]
  108. Maguluri, L.; Ragupathy, R. An efficient stock market trend prediction using the real-time stock technical data and stock social media data. Int. J. Intell. Eng. Syst. 2020, 13, 316–332. [Google Scholar] [CrossRef]
  109. Thakkar, A.; Patel, D.; Shah, P. Pearson correlation coefficient-based performance enhancement of Vanilla neural network for stock trend prediction. Neural Comput. Appl. 2021, 33, 16985–17000. [Google Scholar] [CrossRef]
  110. Wiener, N. What is information theory. IRE Trans. Inf. Theory 1956, 2, 48. [Google Scholar] [CrossRef]
  111. Geweke, J. Measurement of linear dependence and feedback between multiple time series. J. Am. Stat. Assoc. 1982, 77, 304–313. [Google Scholar] [CrossRef]
  112. Guo, S.; Seth, A.; Kendrick, K.; Zhou, C.; Feng, J. Partial Granger causality—Eliminating exogenous inputs and latent variables. J. Neurosci. Methods 2008, 172, 79–93. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  113. Ancona, N.; Marinazzo, D.; Stramaglia, S. Radial basis function approach to nonlinear Granger causality of time series. Phys. Rev. E 2004, 70, 056221. [Google Scholar] [CrossRef] [Green Version]
  114. Marinazzo, D.; Pellicoro, M.; Stramaglia, S. Kernel-Granger causality and the analysis of dynamical networks. Phys. Rev. E 2008, 77, 056215. [Google Scholar] [CrossRef] [Green Version]
  115. Zhao, Y.; Billings, S.; Wei, H.; He, F.; Sarrigiannis, P. A new NARX-based Granger linear and nonlinear casual influence detection method with applications to EEG data. J. Neurosci. Methods 2013, 212, 79–86. [Google Scholar] [CrossRef]
  116. Baek, E.; Brock, W. A General Test for Granger Causality: Bivariate Model; Working Paper; Iowa State University and University of Wisconsin: Madison, WI, USA, 1992. [Google Scholar]
  117. Hiemstra, C.; Jones, J. Testing for linear and nonlinear Granger causality in the stock price-volume relation. J. Financ. 1994, 49, 1639–1664. [Google Scholar]
  118. Diks, C.; Panchenko, V. A new statistic and practical guidelines for nonparametric Granger causality testing. J. Econ. Dyn. Control. 2006, 30, 1647–1669. [Google Scholar] [CrossRef] [Green Version]
  119. Bai, Z.; Wong, W.K.; Zhang, B. Multivariate linear and nonlinear causality tests. Math. Comput. Simul. 2010, 81, 5–17. [Google Scholar] [CrossRef] [Green Version]
  120. Schreiber, T. Measuring information transfer. Phys. Rev. Lett. 2000, 85, 461. [Google Scholar] [CrossRef] [Green Version]
  121. Vakorin, V.; Krakovska, O.; McIntosh, A. Confounding effects of indirect connections on causality estimation. J. Neurosci. Methods 2009, 184, 152–160. [Google Scholar] [CrossRef] [PubMed]
  122. Papana, A.; Kugiumtzis, D.; Larsson, P. Detection of direct causal effects and application to epileptic electroencephalogram analysis. Int. J. Bifurc. Chaos 2012, 22, 1250222. [Google Scholar] [CrossRef]
  123. Montalto, A.; Faes, L.; Marinazzo, D. MuTE: A MATLAB toolbox to compare established and novel estimators of the multivariate transfer entropy. PLoS ONE 2014, 9, e109462. [Google Scholar]
  124. Vlachos, I.; Kugiumtzis, D. Nonuniform state-space reconstruction and coupling detection. Phys. Rev. E 2010, 82, 016207. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  125. Kugiumtzis, D. Direct-coupling information measure from nonuniform embedding. Phys. Rev. E 2013, 87, 062918. [Google Scholar] [CrossRef] [Green Version]
  126. Zhang, J. Low-dimensional approximation searching strategy for transfer entropy from non-uniform embedding. PLoS ONE 2018, 13, e0194382. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  127. Jia, Z.; Lin, Y.; Jiao, Z.; Ma, Y.; Wang, J. Detecting causality in multivariate time series via non-uniform embedding. Entropy 2019, 21, 1233. [Google Scholar] [CrossRef] [Green Version]
  128. Quiroga, R.; Kraskov, A.; Kreuz, T.; Grassberger, P. Performance of different synchronization measures in real data: A case study on electroencephalographic signals. Phys. Rev. E 2002, 65, 041903. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  129. Arnhold, J.; Grassberger, P.; Lehnertz, K.; Elger, C. A robust method for detecting interdependences: Application to intracranially recorded EEG. Phys. D Nonlinear Phenom. 1999, 134, 419–430. [Google Scholar] [CrossRef] [Green Version]
  130. Quian Quiroga, R.; Arnhold, J.; Grassberger, P. Learning driver-response relationships from synchronization patterns. Phys. Rev. E 2000, 61, 5142. [Google Scholar] [CrossRef] [Green Version]
  131. Breakspear, M.; Terry, J. Topographic organization of nonlinear interdependence in multichannel human EEG. NeuroImage 2002, 16, 822–835. [Google Scholar] [CrossRef] [PubMed]
  132. Breakspear, M.; Terry, J. Nonlinear interdependence in neural systems: Motivation, theory, and relevance. Int. J. Neurosci. 2002, 112, 1263–1284. [Google Scholar] [CrossRef]
  133. Andrzejak, R.; Kraskov, A.; Stögbauer, H.; Mormann, F.; Kreuz, T. Bivariate surrogate techniques: Necessity, strengths, and caveats. Phys. Rev. E 2003, 68, 066202. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  134. Bhattacharya, J.; Pereda, E.; Petsche, H. Effective detection of coupling in short and noisy bivariate data. IEEE Trans. Syst. Man Cybern. Part B Cybern. 2003, 33, 85–95. [Google Scholar] [CrossRef]
  135. Chen, Y.; Rangarajan, G.; Feng, J.; Ding, M. Analyzing multiple nonlinear time series with extended Granger causality. Phys. Lett. A 2004, 324, 26–35. [Google Scholar] [CrossRef] [Green Version]
  136. Sugihara, G.; May, R.; Ye, H.; Hsieh, C.H.; Deyle, E.; Fogarty, M.; Munch, S. Detecting causality in complex ecosystems. Science 2012, 338, 496–500. [Google Scholar] [CrossRef]
  137. Pearl, J. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference; Morgan Kaufmann: Burlington, MA, USA, 1988. [Google Scholar]
  138. Spirtes, P. Detecting causal relations in the presence of unmeasured variables. In Uncertainty Proceedings 1991; Elsevier: Amsterdam, The Netherlands, 1991; pp. 392–397. [Google Scholar]
  139. Spirtes, P.; Glymour, C.; Schienes, R. Causation Prediction and Search: Springer Lecture Notes in Statistics, 1st ed.; Springer: Berlin, Germany, 1993. [Google Scholar]
  140. Spirtes, P.; Glymour, C.; Scheines, R.; Heckerman, D. Causation, Prediction, and Search; MIT Press: Cambridge, MA, USA, 2000. [Google Scholar]
  141. Pearl, J. Causality: Models, Reasoning and Inference; Cambridge University Press: Cambridge, UK, 2000; p. 400. [Google Scholar]
  142. Shimizu, S.; Hoyer, P.; Hyvärinen, A.; Kerminen, A.; Jordan, M. A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res. 2006, 7, 2003–2030. [Google Scholar]
  143. Hoyer, P.; Janzing, D.; Mooij, J.; Peters, J.; Schölkopf, B. Nonlinear causal discovery with additive noise models. In Proceedings of the Twenty-Second Annual Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 8–11 December 2008; Volume 21, pp. 689–696. [Google Scholar]
  144. Koller, D.; Friedman, N. Probabilistic Graphical Models: Principles and Techniques, 1st ed.; MIT Press: Cambridge, MA, USA, 2009. [Google Scholar]
  145. Entner, D.; Hoyer, P. On causal discovery from time series data using FCI. In Probabilistic Graphical Models; MIT Press: Cambridge, MA, USA, 2010; pp. 121–128. [Google Scholar]
  146. Runge, J.; Nowack, P.; Kretschmer, M.; Flaxman, S.; Sejdinovic, D. Detecting and quantifying causal associations in large nonlinear time series datasets. Sci. Adv. 2019, 5, eaau4996. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  147. Kořenek, J.; Hlinka, J. Causal network discovery by iterative conditioning: Comparison of algorithms. Chaos Interdiscip. J. Nonlinear Sci. 2020, 30, 013117. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  148. Kaminski, M.; Blinowska, K. A new method of the description of the information flow in the brain structures. Biol. Cybern. 1991, 65, 203–210. [Google Scholar] [CrossRef]
  149. Baccalá, L.; Sameshima, K. Partial directed coherence: A new concept in neural structure determination. Biol. Cybern. 2001, 84, 463–474. [Google Scholar] [CrossRef]
  150. Korzeniewska, A.; Mańczak, M.; Kamiński, M.; Blinowska, K.; Kasicki, S. Determination of information flow direction among brain structures by a modified directed transfer function (dDTF) method. J. Neurosci. Methods 2003, 125, 195–207. [Google Scholar] [CrossRef]
  151. Baccala, L.; Sameshima, K.; Takahashi, D. Generalized partial directed coherence. In Proceedings of the 15th International Conference on Digital Signal Processing, Cardiff, UK, 1–4 July 2007; pp. 163–166. [Google Scholar]
  152. Nolte, G.; Ziehe, A.; Nikulin, V.; Schlögl, A.; Krämer, N.; Brismar, T.; Müller, K.R. Robustly estimating the flow direction of information in complex physical systems. Phys. Rev. Lett. 2008, 100, 234101. [Google Scholar] [CrossRef] [Green Version]
  153. Breitung, J.; Candelon, B. Testing for short-and long-run causality: A frequency-domain approach. J. Econom. 2006, 132, 363–378. [Google Scholar] [CrossRef]
  154. Dhamala, M.; Rangarajan, G.; Ding, M. Analyzing information flow in brain networks with nonparametric Granger causality. NeuroImage 2008, 41, 354–362. [Google Scholar] [CrossRef] [Green Version]
  155. Jachan, M.; Henschel, K.; Nawrath, J.; Schad, A.; Timmer, J.; Schelter, B. Inferring direct directed-information flow from multivariate nonlinear time series. Phys. Rev. E 2009, 80, 011138. [Google Scholar] [CrossRef] [Green Version]
  156. Omidvarnia, A.; Mesbah, M.; Khlif, M.; O’Toole, J.; Colditz, P.; Boashash, B. Kalman filter-based time-varying cortical connectivity analysis of newborn EEG. In Proceedings of the 2011 Annual International Conference of the IEEE Engineering in Medicine and Biology Society, Boston, MA, USA, 30 August–3 September 2011; pp. 1423–1426. [Google Scholar]
  157. He, F.; Billings, S.; Wei, H.L.; Sarrigiannis, P. A nonlinear causality measure in the frequency domain: Nonlinear partial directed coherence with applications to EEG. J. Neurosci. Methods 2014, 225, 71–80. [Google Scholar] [CrossRef]
  158. Lütkepohl, H. New Introduction to Multiple Time Series Analysis; Springer Science & Business Media: Cham, Switzerland, 2005. [Google Scholar]
  159. Schiatti, L.; Nollo, G.; Rossato, G.; Faes, L. Extended Granger causality: A new tool to identify the structure of physiological networks. Physiol. Meas. 2015, 36, 827. [Google Scholar] [CrossRef]
  160. Gianetto, Q.; Raïssi, H. Testing instantaneous causality in presence of nonconstant unconditional covariance. J. Bus. Econ. Stat. 2015, 33, 46–53. [Google Scholar] [CrossRef]
161. Hyvärinen, A.; Zhang, K.; Shimizu, S.; Hoyer, P. Estimation of a structural vector autoregression model using non-Gaussianity. J. Mach. Learn. Res. 2010, 11, 1709–1731. [Google Scholar]
  162. Peters, J.; Janzing, D.; Schölkopf, B. Causal inference on time series using restricted structural equation models. In Proceedings of the 26th International Conference on Neural Information Processing Systems, Lake Tahoe, NV, USA, 5–10 December 2013; pp. 154–162. [Google Scholar]
  163. Lopez-Paz, D.; Muandet, K.; Schölkopf, B.; Tolstikhin, I. Towards a learning theory of cause-effect inference. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 1452–1461. [Google Scholar]
  164. Peters, J.; Janzing, D.; Schölkopf, B. Elements of Causal Inference: Foundations and Learning Algorithms; The MIT Press: Cambridge, MA, USA, 2017. [Google Scholar]
  165. Faes, L.; Erla, S.; Porta, A.; Nollo, G. A framework for assessing frequency domain causality in physiological time series with instantaneous effects. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2013, 371, 20110618. [Google Scholar] [CrossRef] [Green Version]
  166. Baccalá, L.; Sameshima, K. Frequency domain repercussions of instantaneous Granger causality. Entropy 2021, 23, 1037. [Google Scholar] [CrossRef]
  167. Rodrigues, J.; Andrade, A. Instantaneous Granger causality with the Hilbert-Huang transform. Int. Sch. Res. Not. 2013, 2013, 374064. [Google Scholar] [CrossRef] [Green Version]
  168. Faes, L.; Nollo, G. Extended causal modeling to assess partial directed coherence in multiple time series with significant instantaneous interactions. Biol. Cybern. 2010, 103, 387–400. [Google Scholar] [CrossRef] [PubMed]
  169. Faes, L.; Nollo, G.; Porta, A. Compensated transfer entropy as a tool for reliably estimating information transfer in physiological time series. Entropy 2013, 15, 198–219. [Google Scholar] [CrossRef] [Green Version]
  170. Gao, W.; Cui, W.; Ye, W. Directed information graphs for the Granger causality of multivariate time series. Phys. A Stat. Mech. Its Appl. 2017, 486, 701–710. [Google Scholar] [CrossRef]
  171. Koutlis, C.; Kimiskidis, V.; Kugiumtzis, D. Identification of hidden sources by estimating instantaneous causality in high-dimensional biomedical time series. Int. J. Neural Syst. 2019, 29, 1850051. [Google Scholar] [CrossRef]
172. Runge, J. Discovering contemporaneous and lagged causal relations in autocorrelated nonlinear time series datasets. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, Toronto, ON, Canada, 3–6 August 2020; pp. 1388–1397. [Google Scholar]
  173. Gow, D., Jr.; Segawa, J.; Ahlfors, S.; Lin, F.H. Lexical influences on speech perception: A Granger causality analysis of MEG and EEG source estimates. NeuroImage 2008, 43, 614–623. [Google Scholar] [CrossRef] [Green Version]
  174. Seth, A.; Barrett, A.; Barnett, L. Granger causality analysis in neuroscience and neuroimaging. J. Neurosci. 2015, 35, 3293–3297. [Google Scholar] [CrossRef] [PubMed]
  175. He, J.; Shang, P. Comparison of transfer entropy methods for financial time series. Phys. A Stat. Mech. Appl. 2017, 482, 772–785. [Google Scholar] [CrossRef]
  176. Tang, Y.; Xiong, J.; Luo, Y.; Zhang, Y.C. How do the global stock markets Influence one another? Evidence from finance big data and granger causality directed network. Int. J. Electron. Commer. 2019, 23, 85–109. [Google Scholar] [CrossRef] [Green Version]
  177. Basti, A.; Pizzella, V.; Chella, F.; Romani, G.; Nolte, G.; Marzetti, L. Disclosing large-scale directed functional connections in MEG with the multivariate phase slope index. NeuroImage 2018, 175, 161–175. [Google Scholar] [CrossRef]
  178. Meng, Q.; Zhang, Y. Discovery of spatial-temporal causal interactions between thermal and methane anomalies associated with the Wenchuan earthquake. Eur. Phys. J. Spec. Top. 2021, 230, 247–261. [Google Scholar] [CrossRef]
  179. Qu, Y.; Montzka, C.; Vereecken, H. Causation discovery of weather and vegetation condition on global wildfire using the PCMCI Approach. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Brussels, Belgium, 11–16 July 2021; pp. 8644–8647. [Google Scholar]
  180. Zhu, J.; Sun, C.; Li, V. An extended spatio-temporal Granger causality model for air quality estimation with heterogeneous urban big data. IEEE Trans. Big Data 2017, 3, 307–319. [Google Scholar] [CrossRef]
  181. Li, M.; Li, J.; Wan, S.; Chen, H.; Liu, C. Causal identification based on compressive sensing of air pollutants using urban big data. IEEE Access 2020, 8, 109207–109216. [Google Scholar] [CrossRef]
  182. Erten, E.; Lizier, J.; Piraveenan, M.; Prokopenko, M. Criticality and information dynamics in epidemiological models. Entropy 2017, 19, 194. [Google Scholar] [CrossRef]
  183. Reich, B.; Yang, S.; Guan, Y.; Giffin, A.; Miller, M.; Rappold, A. A review of spatial causal inference methods for environmental and epidemiological applications. Int. Stat. Rev. 2021, 89, 605–634. [Google Scholar] [CrossRef]
  184. Embrechts, P.; McNeil, A.; Straumann, D. Correlation and dependence in risk management: Properties and pitfalls. Risk Manag. Value Risk Beyond 2002, 1, 176–223. [Google Scholar]
  185. Faes, L.; Erla, S.; Nollo, G. Measuring connectivity in linear multivariate processes: Definitions, interpretation, and practical analysis. Comput. Math. Methods Med. 2012, 2012, 140513. [Google Scholar] [CrossRef] [PubMed]
  186. Florin, E.; Pfeifer, J. Statistical pitfalls in the comparison of multivariate causality measures for effective causality. Comput. Biol. Med. 2013, 43, 131–134. [Google Scholar] [CrossRef] [PubMed]
  187. Frye, R. A lack of statistical pitfalls in the comparison of multivariate causality measures for effective causality. Comput. Biol. Med. 2013, 43, 962–965. [Google Scholar] [CrossRef]
  188. Stokes, P.; Purdon, P. A study of problems encountered in Granger causality analysis from a neuroscience perspective. Proc. Natl. Acad. Sci. USA 2017, 114, E7063–E7072. [Google Scholar] [CrossRef] [Green Version]
  189. Faes, L.; Stramaglia, S.; Marinazzo, D. On the interpretability and computational reliability of frequency-domain Granger causality. F1000Research 2017, 6, 1710. [Google Scholar] [CrossRef]
  190. Tennant, P.; Arnold, K.; Berrie, L.; Ellison, G.; Gilthorpe, M. Advanced modelling strategies: Challenges and pitfalls in robust causal inference with observational data. In Advanced Modelling Strategies: Challenges and Pitfalls in Robust Causal Inference with Observational Data; Leeds Institute for Data Analytics: Leeds, UK, 2017. [Google Scholar]
  191. Kozak, M. Teaching statistics = teaching thinking statistically. Model Assist. Stat. Appl. 2009, 4, 275–279. [Google Scholar] [CrossRef]
  192. Papana, A.; Kyrtsou, C.; Kugiumtzis, D.; Diks, C. Detecting causality in non-stationary time series using partial symbolic transfer entropy: Evidence in financial data. Comput. Econ. 2016, 47, 341–365. [Google Scholar] [CrossRef]
  193. Kozak, M. Online platform supporting teaching correlation. Model Assist. Stat. Appl. 2011, 6, 71–74. [Google Scholar] [CrossRef]
  194. Faul, F.; Erdfelder, E.; Buchner, A.; Lang, A.G. Statistical power analyses using G*Power 3.1: Tests for correlation and regression analyses. Behav. Res. Methods 2009, 41, 1149–1160. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  195. Bujang, M.; Baharum, N. Sample size guideline for correlation analysis. World 2016, 3, 37–46. [Google Scholar] [CrossRef]
  196. Ramos, A.; Macau, E. Minimum sample size for reliable causal inference using transfer entropy. Entropy 2017, 19, 150. [Google Scholar] [CrossRef] [Green Version]
  197. Fremeth, A.; Holburn, G.; Richter, B. Making Causal Inferences in Small Samples Using Synthetic Control Methodology: Did Chrysler Benefit from Government Assistance? 2013. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2135294 (accessed on 23 October 2021).
  198. Zhang, B.G.; Li, W.; Shi, Y.; Liu, X.; Chen, L. Detecting causality from short time-series data based on prediction of topologically equivalent attractors. BMC Syst. Biol. 2017, 11, 141–150. [Google Scholar] [CrossRef] [Green Version]
  199. Helske, J.; Tikka, S.; Karvanen, J. Estimation of causal effects with small data in the presence of trapdoor variables. J. R. Stat. Soc. Ser. A Stat. Soc. 2021, 184, 1030–1051. [Google Scholar] [CrossRef]
  200. Yu, Z.; Liu, J.; Noda, I. Effect of noise on the evaluation of correlation coefficients in two-dimensional correlation spectroscopy. Appl. Spectrosc. 2003, 57, 1605–1609. [Google Scholar] [CrossRef]
  201. Khan, S.; Bandyopadhyay, S.; Ganguly, A.; Saigal, S.; Erickson, D., III; Protopopescu, V.; Ostrouchov, G. Relative performance of mutual information estimation methods for quantifying the dependence among short and noisy data. Phys. Rev. E 2007, 76, 026209. [Google Scholar] [CrossRef] [Green Version]
  202. Hassani, H.; Dionisio, A.; Ghodsi, M. The effect of noise reduction in measuring the linear and nonlinear dependency of financial markets. Nonlinear Anal. Real World Appl. 2010, 11, 492–502. [Google Scholar] [CrossRef]
  203. Pillow, J.; Latham, P. Neural characterization in partially observed populations of spiking neurons. Adv. Neural Inf. Process. Syst. 2008, 20, 1–8. [Google Scholar]
  204. Vidne, M.; Ahmadian, Y.; Shlens, J.; Pillow, J.; Kulkarni, J.; Litke, A.; Chichilnisky, E.; Simoncelli, E.; Paninski, L. Modeling the impact of common noise inputs on the network activity of retinal ganglion cells. J. Comput. Neurosci. 2012, 33, 97–121. [Google Scholar] [CrossRef] [PubMed] [Green Version]
205. Eichler, M. Causal inference with multiple time series: Principles and problems. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2013, 371, 20110613. [Google Scholar] [CrossRef] [Green Version]
  206. Bachschmid-Romano, L.; Opper, M. Inferring hidden states in a random kinetic Ising model: Replica analysis. J. Stat. Mech. Theory Exp. 2014, 2014, P06013. [Google Scholar] [CrossRef] [Green Version]
207. Geiger, P.; Zhang, K.; Schoelkopf, B.; Gong, M.; Janzing, D. Causal inference by identification of vector autoregressive processes with hidden components. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 1917–1925. [Google Scholar]
  208. Runge, J. Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos Interdiscip. J. Nonlinear Sci. 2018, 28, 075310. [Google Scholar] [CrossRef] [PubMed]
  209. Rizzo, M.; Szekely, G. E-Statistics: Multivariate Inference via the Energy of Data. 2021. Available online: https://github.com/mariarizzo/energy (accessed on 23 October 2021).
  210. Kraskov, A.; Stögbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69, 066138. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  211. Varotto, G.; Visani, E.; Canafoglia, L.; Franceschetti, S.; Avanzini, G.; Panzica, F. Enhanced frontocentral EEG connectivity in photosensitive generalized epilepsies: A partial directed coherence study. Epilepsia 2012, 53, 359–367. [Google Scholar] [CrossRef]
  212. Zhao, Z.; Wang, C. Using partial directed coherence to study alpha-band effective brain networks during a visuospatial attention task. Behav. Neurol. 2019, 2019, 1410425. [Google Scholar] [CrossRef] [PubMed]
  213. Letson, D.; McCullough, B. ENSO and soybean prices: Correlation without causality. J. Agric. Appl. Econ. 2001, 33, 513–521. [Google Scholar] [CrossRef] [Green Version]
  214. Roy-García, I.; Rivas-Ruiz, R.; Pérez-Rodríguez, M.; Palacios-Cruz, L. Correlation: Not all correlation entails causality. Rev. Alerg. Mex. 2019, 66, 354–360. [Google Scholar]
Figure 1. The true causal network of (a) S1, (b) S2, (c) S3. Dotted lines denote contemporaneous dependencies and directed arrows denote temporal (causal) dependencies.
Table 1. Non-directional connectivity measures.
Measure | Reference
Pearson product-moment correlation coefficient | [43]
Spearman rank correlation coefficient | [44]
Kendall’s rank correlation coefficient | [45]
Hoeffding’s test of independence | [46]
Biweight midcorrelation | [88]
Coefficient of determination | [48]
Distance correlation | [49,50]
Partial distance correlation | [51]
Yule’s Q | [52]
Yule’s Y | [53]
CANOVA | [9]
Randomized Dependence Coefficient | [56]
Mutual information | [65,66,67]
Nonlinear correlation information entropy | [64]
Entropy correlation coefficient | [68]
Entropy coefficient of determination | [69]
Maximal information coefficient | [70]
Partial maximal information coefficient | [71]
Coherence | [73]
Mean phase coherence | [12,79]
Phase locking value | [12,78]
Determinism | [83,84]
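Several of the rank- and moment-based measures in Table 1 are straightforward to compute. The sketch below (plain Python; the function names are ours for illustration, not from the cited implementations) contrasts the Pearson coefficient [43] with the Spearman coefficient [44] on a monotone but nonlinear relationship, where only the rank-based measure attains its maximum.

```python
import math

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(x):
    """Rank transform (no tie handling; values assumed distinct)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    for rank, i in enumerate(order, start=1):
        r[i] = float(rank)
    return r

def spearman(x, y):
    """Spearman rank correlation: Pearson on the rank-transformed data."""
    return pearson(ranks(x), ranks(y))

x = [float(i) for i in range(1, 21)]
y = [xi ** 3 for xi in x]            # monotone but nonlinear link
print(round(pearson(x, y), 3))       # below 1: linearity is violated
print(round(spearman(x, y), 3))      # exactly 1: monotonicity is perfect
```

With real data, library routines additionally handle ties and significance testing; the point here is only the distinction between linear and monotone association.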
Table 2. Directional connectivity measures.
MeasureReference
Granger causality[1]
Conditional Granger causality[111]
Partial Granger causality[112]
Granger causality on radial basis functions[113]
Granger causality on kernel functions[114]
Granger causality on nonlinear autoregressive exogenous models[115]
Baek and Brok test[116]
Hiemstra and Jones test[117]
Diks and Panchenko test[118]
Nonlinear multivariate causality test of Hiemstra and Jones[119]
Transfer entropy[120]
Partial transfer entropy[121,122]
Partial transfer entropy with nonuniform embedding[123]
Mutual information on mixed embedding[124]
Partial mutual information on mixed embedding[125]
Low-dimensional approximation of transfer entropy[126,127]
Nonlinear interdependence measures[129,130,131,132,133,134]
(Conditional) extended Granger causality[135]
PC algorithm[138]
Fast Causal Inference[140]
tsFCI[145]
PCMCI[146]
Geweke’s spectral Granger causality[111]
Directed transfer function[148]
Partial directed coherence[149]
Direct directed transfer function[150]
Generalized partial directed coherence[151]
Phase Slope Index[152]
Nonparametric partial directed coherence[155]
DEKF-based Partial directed coherence[156]
Nonlinear partial directed coherence[157]
Extended Granger causality[159]
Compensated transfer entropy[168,169]
PMIME0[171]
PCMCI+[172]
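A bivariate Granger causality test (the first entry in Table 2) reduces to comparing a restricted autoregression of the effect on its own past against an unrestricted one that adds the lagged candidate cause. The sketch below uses lag order 1 and a hand-rolled OLS solver; it is an assumption-laden illustration, not the paper's implementation:

```python
import random

def ols_rss(X, y):
    """Residual sum of squares of an OLS fit, via Gauss-Jordan
    elimination on the normal equations (X'X) b = X'y."""
    k, n = len(X[0]), len(X)
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
         + [sum(X[i][p] * y[i] for i in range(n))] for p in range(k)]
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    b = [A[i][k] / A[i][i] for i in range(k)]
    return sum((y[i] - sum(b[j] * X[i][j] for j in range(k))) ** 2
               for i in range(n))

def granger_f(cause, effect):
    """F-statistic for 'cause Granger-causes effect', lag order 1."""
    n = len(effect)
    Xr = [[1.0, effect[t - 1]] for t in range(1, n)]                 # restricted
    Xu = [[1.0, effect[t - 1], cause[t - 1]] for t in range(1, n)]   # + lagged cause
    yv = effect[1:]
    rss_r, rss_u = ols_rss(Xr, yv), ols_rss(Xu, yv)
    return (rss_r - rss_u) / (rss_u / (len(yv) - 3))

# Toy system: x drives y at lag 1, no feedback
random.seed(2)
n = 1000
x, y = [random.gauss(0, 1)], [random.gauss(0, 1)]
for t in range(1, n):
    x.append(0.5 * x[-1] + random.gauss(0, 1))
    y.append(0.4 * y[-1] + 0.6 * x[-2] + random.gauss(0, 1))

print(granger_f(x, y))  # large: past of x improves prediction of y
print(granger_f(y, x))  # near the null: no feedback from y to x
```

Unlike the symmetric measures of Table 1, the statistic is asymmetric in its arguments, so it recovers the direction of the coupling.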
Table 3. Percentage of significant correlations based on selected connectivity measures from 100 realizations of system 1. Rows drive the columns.
(Paired 5 × 5 matrices of percentages for PPCor, PSpCor, pdCor, MI, CGCI, RCGCI, PTENUE and PDC; PMIME0 and PCMCI+ are each reported twice, once for contemporaneous effects and once for causal effects.)
Table 4. Percentage of significant correlations based on selected connectivity measures from 100 realizations of system 2. Rows drive the columns.
(Paired 5 × 5 matrices of percentages for PPCor, PSpCor, pdCor, MI, CGCI, RCGCI, PDC and PTENUE; PMIME0 and PCMCI+ are each reported twice, once for contemporaneous effects and once for causal effects.)
Table 5. Percentage of significant correlations based on selected connectivity measures from 100 realizations of system 3. Rows drive the columns.
(Paired 5 × 5 matrices of percentages for PPCor, PSpCor, pdCor, MI, CGCI, RCGCI, PTENUE and PDC; PMIME0 and PCMCI+ are each reported twice, once for contemporaneous effects and once for causal effects.)
Share and Cite

MDPI and ACS Style

Papana, A. Connectivity Analysis for Multivariate Time Series: Correlation vs. Causality. Entropy 2021, 23, 1570. https://doi.org/10.3390/e23121570
