Association measures

Overview

| Type | Measure | Function version |
| ---- | ------- | ---------------- |
| Correlation | PearsonCorrelation | pearson_correlation |
| Correlation | DistanceCorrelation | distance_correlation |
| Closeness | SMeasure | s_measure |
| Closeness | HMeasure | h_measure |
| Closeness | MMeasure | m_measure |
| Closeness (ranks) | LMeasure | l_measure |
| Closeness | JointDistanceDistribution | jdd |
| Cross-mapping | PairwiseAsymmetricInference | crossmap |
| Cross-mapping | ConvergentCrossMapping | crossmap |
| Conditional recurrence | MCR | mcr |
| Conditional recurrence | RMCD | rmcd |
| Shared information | MIShannon | mutualinfo |
| Shared information | MIRenyiJizba | mutualinfo |
| Shared information | MIRenyiSarbu | mutualinfo |
| Shared information | MITsallisFuruichi | mutualinfo |
| Shared information | PartialCorrelation | partial_correlation |
| Shared information | CMIShannon | condmutualinfo |
| Shared information | CMIRenyiSarbu | condmutualinfo |
| Shared information | CMIRenyiJizba | condmutualinfo |
| Information transfer | TEShannon | transferentropy |
| Information transfer | TERenyiJizba | transferentropy |
| Part mutual information | PMI | pmi |
| Information asymmetry | PA | asymmetry |

Correlation measures

Pearson correlation

CausalityTools.PearsonCorrelationType
PearsonCorrelation

The Pearson correlation of two variables.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with pearson_correlation to compute the raw correlation coefficient.

Description

The sample Pearson correlation coefficient for real-valued random variables $X$ and $Y$ with associated samples $\{x_i\}_{i=1}^N$ and $\{y_i\}_{i=1}^N$ is defined as

\[\rho_{xy} = \dfrac{\sum_{i=1}^N (x_i - \bar{x})(y_i - \bar{y}) }{\sqrt{\sum_{i=1}^N (x_i - \bar{x})^2}\sqrt{\sum_{i=1}^N (y_i - \bar{y})^2}},\]

where $\bar{x}$ and $\bar{y}$ are the sample means of the observations $x_i$ and $y_i$, respectively.
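
For concreteness, here is a direct computation of the coefficient above in plain Julia. This is a sketch of the formula only; Statistics.cor (and the package's pearson_correlation) computes the same quantity.

using Statistics

x = randn(1000)
y = 0.7 .* x .+ 0.3 .* randn(1000)
x̄, ȳ = mean(x), mean(y)
# Sample Pearson correlation coefficient, exactly as in the formula above.
ρ = sum((x .- x̄) .* (y .- ȳ)) /
    (sqrt(sum(abs2, x .- x̄)) * sqrt(sum(abs2, y .- ȳ)))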

source

Partial correlation

CausalityTools.PartialCorrelationType
PartialCorrelation <: AssociationMeasure

The correlation of two variables, with the effect of a set of conditioning variables removed.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with partial_correlation to compute the raw correlation coefficient.

Description

There are several ways of estimating the partial correlation. We follow the matrix inversion method, because for StateSpaceSets, we can very efficiently compute the required joint covariance matrix $\Sigma$ for the random variables.

Formally, let $X_1, X_2, \ldots, X_n$ be a set of $n$ real-valued random variables. Consider the joint precision matrix $P = (p_{ij}) = \Sigma^{-1}$. The partial correlation of any pair of variables $(X_i, X_j)$, given the remaining variables $\bf{Z} = \{X_k\}_{k=1, k \neq i, j}^n$, is defined as

\[\rho_{X_i X_j | \bf{Z}} = -\dfrac{p_{ij}}{\sqrt{ p_{ii} p_{jj} }}.\]

In practice, we compute the estimate

\[\hat{\rho}_{X_i X_j | \bf{Z}} = -\dfrac{\hat{p}_{ij}}{\sqrt{ \hat{p}_{ii} \hat{p}_{jj} }},\]

where $\hat{P} = \hat{\Sigma}^{-1}$ is the sample precision matrix.
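
For concreteness, a minimal sketch of the matrix-inversion route for three variables, using only Statistics and LinearAlgebra. This illustrates the estimator formula above, not the package internals; partial_correlation handles this for you.

using Statistics, LinearAlgebra

x1 = randn(1000)
x3 = randn(1000)
x2 = 0.5 .* x1 .+ 0.5 .* x3 .+ 0.1 .* randn(1000)

Σ = cov(hcat(x1, x2, x3))   # sample covariance matrix of the three variables
P = inv(Σ)                  # sample precision matrix P̂ = Σ̂⁻¹
# Partial correlation of X₁ and X₂ given X₃, per the formula above.
ρ12_3 = -P[1, 2] / sqrt(P[1, 1] * P[2, 2])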

source

Distance correlation

CausalityTools.DistanceCorrelationType
DistanceCorrelation

The distance correlation (Székely et al., 2007)[Székely2007] measure quantifies potentially nonlinear associations between pairs of variables. If applied to three variables, the partial distance correlation (Székely and Rizzo, 2014)[Székely2014] is computed.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with distance_correlation to compute the raw distance correlation coefficient.
Warn

A partial distance correlation distance_correlation(X, Y, Z) = 0 doesn't always guarantee conditional independence X ⫫ Y | Z. See Székely and Rizzo (2014) for in-depth discussion.
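
A minimal usage sketch, assuming the two- and three-argument distance_correlation forms implied by the usage notes above (random, independent inputs are used purely for illustration).

using CausalityTools

x, y, z = randn(500), randn(500), randn(500)
distance_correlation(x, y)      # pairwise distance correlation
distance_correlation(x, y, z)   # partial distance correlation, conditioning on z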

source

Closeness measures

Joint distance distribution

CausalityTools.JointDistanceDistributionType
JointDistanceDistribution <: AssociationMeasure
JointDistanceDistribution(; metric = Euclidean(), B = 10, D = 2, τ = -1, μ = 0.0)

The joint distance distribution (JDD) measure (Amigó and Hirata, 2018).

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with jdd to compute the raw joint distance distribution.

Keyword arguments

  • metric::Metric: An instance of a valid distance metric from Distances.jl. Defaults to Euclidean().
  • B::Int: The number of equidistant subintervals to divide the interval [0, 1] into when comparing the normalised distances.
  • D::Int: Embedding dimension.
  • τ::Int: Embedding delay. By convention, τ is negative.
  • μ: The hypothetical mean value of the joint distance distribution if there is no coupling between x and y (default is μ = 0.0).

Description

From input time series $x(t)$ and $y(t)$, we first construct the delay embeddings (note the positive sign in the embedding lags; therefore the input parameter τ is by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau}, \ldots, x_{i+(d_x - 1)\tau}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau}, \ldots, y_{i+(d_y - 1)\tau}) \} \\ \end{align*}\]

The algorithm then proceeds to analyze the distribution of distances between points of these embeddings, as described in Amigó and Hirata (2018).

Examples
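
A minimal sketch, assuming the jdd(measure, x, y) form listed in the overview table (random, independent inputs are used purely for illustration).

using CausalityTools

x, y = randn(1000), randn(1000)
measure = JointDistanceDistribution(D = 3, B = 5)
jdd(measure, x, y)   # returns the estimated joint distance distribution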

source

S-measure

CausalityTools.SMeasureType
SMeasure <: AssociationMeasure
SMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

SMeasure is a bivariate association measure from Arnhold et al. (1999) and Quiroga et al. (2000) that measures directional dependence between two input (potentially multivariate) time series.

Note that τx and τy are negative; see explanation below.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with s_measure to compute the raw s-measure statistic.

Description

The steps of the algorithm are:

  1. From input time series $x(t)$ and $y(t)$, construct the delay embeddings (note the positive sign in the embedding lags; therefore the input parameters τx and τy are by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau_x}, \ldots, x_{i+(d_x - 1)\tau_x}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau_y}, \ldots, y_{i+(d_y - 1)\tau_y}) \} \\ \end{align*}\]

  2. Let $r_{i,j}$ and $s_{i,j}$ be the indices of the $K$ nearest neighbors of $\bf{x}_i$ and $\bf{y}_i$, respectively. Neighbors closer than w time indices are excluded during searches (i.e. w is the Theiler window).

  3. Compute the mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, using the indices $r_{i, j}$.

\[R_i^{(k)}(x) = \dfrac{1}{k} \sum_{j=1}^{k}\left(\bf{x}_i - \bf{x}_{r_{i,j}}\right)^2\]

  4. Compute the y-conditioned mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, now using the indices $s_{i,j}$.

\[R_i^{(k)}(x|y) = \dfrac{1}{k} \sum_{j=1}^{k}\left(\bf{x}_i - \bf{x}_{s_{i,j}}\right)^2\]

  5. Define the following measure of independence, where $0 \leq S \leq 1$. Low values indicate independence, while values close to one occur for synchronized signals.

\[S^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{R_i^{(k)}(x)}{R_i^{(k)}(x|y)}\]

Input data

The algorithm is slightly modified from Grassberger1999 to allow univariate timeseries as input.

  • If x and y are StateSpaceSets then use x and y as is and ignore the parameters dx/τx and dy/τy.
  • If x and y are scalar time series, then create dx and dy dimensional embeddings, respectively, of both x and y, resulting in N different m-dimensional embedding points $X = \{x_1, x_2, \ldots, x_N \}$ and $Y = \{y_1, y_2, \ldots, y_N \}$. τx and τy control the embedding lags for x and y.
  • If x is a scalar-valued vector and y is a StateSpaceSet, or vice versa, then create an embedding of the scalar timeseries using parameters dx/τx or dy/τy.

In all three cases, input StateSpaceSets are length-matched by eliminating points at the end of the longest StateSpaceSet (after the embedding step, if relevant) before analysis.

source
CausalityTools.s_measureFunction
s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)

Compute the SMeasure from source x to target y.

source
s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet) → s ∈ [0, 1]

Compute the given measure to quantify the directional dependence between univariate/multivariate time series x and y.

Returns a scalar s where s = 0 indicates independence between x and y, and higher values indicate synchronization between x and y, with complete synchronization for s = 1.0.

Example

using CausalityTools

# A two-dimensional Ulam lattice map
sys = ulam(2)

# Sample 1000 points after discarding 5000 transients
orbit = trajectory(sys, 1000, Ttr = 5000)
x, y = orbit[:, 1], orbit[:, 2]

# 4-dimensional embedding for `x`, 5-dimensional embedding for `y`
m = SMeasure(dx = 4, τx = 3, dy = 5, τy = 1)
s_measure(m, x, y)
source

H-measure

CausalityTools.HMeasureType
HMeasure <: AssociationMeasure
HMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The HMeasure (Arnhold et al., 1999) is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with h_measure to compute the raw h-measure statistic.

Description

The HMeasure (Arnhold et al., 1999) is similar to the SMeasure, but the numerator of the formula is replaced by $R_i(x)$, the mean squared Euclidean distance to all other points, and there is a $\log$-term inside the sum:

\[H^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x)}{R_i^{(k)}(x|y)} \right).\]

Parameters are the same and $R_i^{(k)}(x|y)$ is computed as for SMeasure.
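
A minimal sketch, assuming h_measure follows the same calling convention as s_measure above.

using CausalityTools

x, y = randn(800), randn(800)
m = HMeasure(K = 3, dx = 3, τx = -1, dy = 3, τy = -1)
h_measure(m, x, y)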

source

M-measure

CausalityTools.MMeasureType
MMeasure <: AssociationMeasure
MMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The MMeasure (Andrzejak et al., 2003) is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with m_measure to compute the raw m-measure statistic.

Description

The MMeasure is based on SMeasure and HMeasure. It is given by

\[M^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x) - R_i^{(k)}(x|y)}{R_i(x) - R_i^k(x)} \right),\]

where $R_i(x)$ is computed as for HMeasure, while $R_i^k(x)$ and $R_i^{(k)}(x|y)$ are computed as for SMeasure. Parameters also have the same meaning as for SMeasure/HMeasure.

source

L-measure

CausalityTools.LMeasureType
LMeasure <: AssociationMeasure
LMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The LMeasure (Chicharro and Andrzejak, 2009) is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with l_measure to compute the raw l-measure statistic.

Description

LMeasure is similar to MMeasure, but uses distance ranks instead of the raw distances.

Let $\bf{x_i}$ be an embedding vector, and let $g_{i,j}$ denote the rank of the distance between $\bf{x_i}$ and some other vector $\bf{x_j}$ in an ascending sorted list of distances between $\bf{x_i}$ and all other vectors $\bf{x_{j \neq i}}$. In other words, the $g_{i,j}$ are just the ranks of the $N-1$ nearest-neighbor distances of $\bf{x_i}$, sorted in ascending order.

LMeasure is then defined as

\[L^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{G_i(x) - G_i^{(k)}(x|y)}{G_i(x) - G_i^{(k)}(x)} \right),\]

where $G_i(x) = \frac{N}{2}$ and $G_i^{(k)}(x) = \frac{k+1}{2}$ are the mean rank and the minimal mean rank, respectively.

The $y$-conditioned mean rank is defined as

\[G_i^{(k)}(x|y) = \dfrac{1}{K}\sum_{j=1}^{K} g_{i,w_{i, j}},\]

where $w_{i,j}$ is the index of the $j$-th nearest neighbor of $\bf{y_i}$.

source

Cross-map measures

See also the cross mapping API for estimators.

Convergent cross mapping

CausalityTools.ConvergentCrossMappingType
ConvergentCrossMapping <: CrossmapMeasure
ConvergentCrossMapping(; d::Int = 2, τ::Int = -1, w::Int = 0,
    f = Statistics.cor, embed_warn = true)

The convergent cross mapping (CCM) measure (Sugihara et al., 2012).

Specifies the embedding dimension d and embedding lag τ to be used, as described below, with predict or crossmap. The Theiler window w controls how many temporal neighbors are excluded during neighbor searches (w = 0 means that only the point itself is excluded). f is a function that computes the agreement between observations and predictions (the default, f = Statistics.cor, gives the Pearson correlation coefficient).

Embedding

Let S(i) be the source time series variable and T(i) be the target time series variable. This version produces regular embeddings with fixed dimension d and embedding lag τ as follows:

\[(S(i), S(i+\tau), S(i+2\tau), \ldots, S(i+(d-1)\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first D-1 variables, while the last (D-th) variable is to be predicted.

With this convention, τ < 0 implies "past/present values of source used to predict target", and τ > 0 implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if τ > 0 (embed_warn = false turns off warnings).
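
A minimal sketch, assuming the crossmap(measure, x, y) form listed in the overview table. Independent noise series are used here purely for illustration, so the cross-map skill should be low.

using CausalityTools

x, y = randn(500), randn(500)
measure = ConvergentCrossMapping(d = 3, τ = -1)
# Cross-map skill: agreement between observed and predicted values
# (Pearson correlation with the default f = Statistics.cor).
crossmap(measure, x, y)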

source

Pairwise asymmetric inference

CausalityTools.PairwiseAsymmetricInferenceType
PairwiseAsymmetricInference <: CrossmapMeasure
PairwiseAsymmetricInference(; d::Int = 2, τ::Int = -1, w::Int = 0,
    f = Statistics.cor, embed_warn = true)

The pairwise asymmetric inference (PAI) cross mapping measure (McCracken and Weigel, 2014) is a version of ConvergentCrossMapping that searches for neighbors in mixed embeddings (i.e. both source and target variables included); otherwise, the algorithms are identical.

Specifies the embedding dimension d and embedding lag τ to be used, as described below, with predict or crossmap. The Theiler window w controls how many temporal neighbors are excluded during neighbor searches (w = 0 means that only the point itself is excluded). f is a function that computes the agreement between observations and predictions (the default, f = Statistics.cor, gives the Pearson correlation coefficient).

Embedding

There are many possible ways of defining the embedding for PAI. Currently, we only implement the "add one non-lagged source timeseries to an embedding of the target" approach, which is used as an example in McCracken & Weigel's paper. Specifically: Let S(i) be the source time series variable and T(i) be the target time series variable. PairwiseAsymmetricInference produces regular embeddings with fixed dimension d and embedding lag τ as follows:

\[(S(i), T(i+(d-1)\tau), \ldots, T(i+2\tau), T(i+\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first D variables, while the last variable is to be predicted.

With this convention, τ < 0 implies "past/present values of source used to predict target", and τ > 0 implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if τ > 0 (embed_warn = false turns off warnings).

source

Recurrence-based

CausalityTools.MCRType
MCR <: AssociationMeasure
MCR(; r, metric = Euclidean())

An association measure based on mean conditional probabilities of recurrence (MCR) introduced by Romano et al. (2007).

r is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be an instance of any subtype of AbstractRecurrenceType from RecurrenceAnalysis.jl. To use any r that is not a real number, you have to do using RecurrenceAnalysis first. The metric is any valid metric from Distances.jl.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise association.
  • Use with mcr to compute the raw MCR for pairwise association.

Description

For input variables X and Y, the conditional probability of recurrence is defined as

\[M(X | Y) = \dfrac{1}{N} \sum_{i=1}^N p(\bf{y_i} | \bf{x_i}) = \dfrac{1}{N} \sum_{i=1}^N \dfrac{\sum_{j=1}^N JR_{i, j}^{X, Y}}{\sum_{j=1}^N R_{i, j}^X},\]

where $R_{i, j}^X$ is the recurrence matrix and $JR_{i, j}^{X, Y}$ is the joint recurrence matrix, constructed using the given metric. The measure $M(Y | X)$ is defined analogously.

Romano et al. (2007)'s interpretation of this quantity is that if X drives Y, then M(X|Y) > M(Y|X), if Y drives X, then M(Y|X) > M(X|Y), and if coupling is symmetric, then M(Y|X) = M(X|Y).

Input data

X and Y can be either both univariate timeseries, or both multivariate StateSpaceSets.
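
A minimal sketch, assuming the mcr(measure, x, y) form listed in the overview table and a fixed real-valued recurrence threshold.

using CausalityTools

x, y = randn(500), randn(500)
measure = MCR(r = 0.5)   # fixed recurrence threshold
mcr(measure, x, y)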

source
CausalityTools.RMCDType
RMCD <: AssociationMeasure
RMCD(; r, metric = Euclidean(), base = 2)

The recurrence measure of conditional dependence, or RMCD (Ramos et al., 2017), is a recurrence-based measure that mimics the conditional mutual information, but uses recurrence probabilities.

r is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be an instance of any subtype of AbstractRecurrenceType from RecurrenceAnalysis.jl. To use any r that is not a real number, you have to do using RecurrenceAnalysis first. The metric is any valid metric from Distances.jl.

Both the pairwise and conditional RMCD is non-negative, but due to round-off error, negative values may occur. If that happens, an RMCD value of 0.0 is returned.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise or conditional association.
  • Use with rmcd to compute the raw RMCD for pairwise or conditional association.

Description

The RMCD measure is defined by

\[I_{RMCD}(X; Y | Z) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y, Z} \log \left( \dfrac{\sum_{j} R_{ij}^{X, Y, Z} \sum_{j} R_{ij}^{Z} }{\sum_{j} R_{ij}^{X, Z} \sum_{j} R_{ij}^{Y, Z}} \right) \right],\]

where base controls the base of the logarithm. $I_{RMCD}(X; Y | Z)$ is zero when $Z = X$, $Z = Y$ or when $X$, $Y$ and $Z$ are mutually independent.

Our implementation allows dropping the third/last argument, in which case the following mutual information-like quantity is computed (not discussed in Ramos et al. (2017)).

\[ I_{RMCD}(X; Y) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y} \log \left( \dfrac{\sum_{j} R_{ij}^{X} R_{ij}^{Y} }{\sum_{j} R_{ij}^{X, Y}} \right) \right]\]

source

Information measures

Association measures that are information-based are listed here. Available estimators are listed in the information API.

Mutual information (Shannon)

CausalityTools.MIShannonType
MIShannon <: MutualInformation
MIShannon(; base = 2)

The Shannon mutual information $I^S(X; Y)$.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Discrete definition

There are many equivalent formulations of discrete Shannon mutual information. In this package, we currently use the double-sum and the three-entropies formulations.

Double sum formulation

Assume we observe samples $\bar{\bf{X}}_{1:N_x} = \{\bar{\bf{X}}_1, \ldots, \bar{\bf{X}}_{N_x} \}$ and $\bar{\bf{Y}}_{1:N_y} = \{\bar{\bf{Y}}_1, \ldots, \bar{\bf{Y}}_{N_y} \}$ from two discrete random variables $X$ and $Y$ with finite supports $\mathcal{X} = \{ x_1, x_2, \ldots, x_{M_x} \}$ and $\mathcal{Y} = \{ y_1, y_2, \ldots, y_{M_y} \}$. The double-sum estimate is obtained by replacing the probabilities in the double sum by their sample-frequency estimates,

\[\hat{I}_{DS}(X; Y) = \sum_{x_i \in \mathcal{X}, y_j \in \mathcal{Y}} \hat{p}(x_i, y_j) \log \left( \dfrac{\hat{p}(x_i, y_j)}{\hat{p}(x_i)\hat{p}(y_j)} \right),\]

where $\hat{p}(x_i) = \frac{n(x_i)}{N_x}$, $\hat{p}(y_j) = \frac{n(y_j)}{N_y}$, and $\hat{p}(x_i, y_j) = \frac{n(x_i, y_j)}{N}$, with $N = N_x N_y$. This definition is used by mutualinfo when called with a ContingencyMatrix.

Three-entropies formulation

An equivalent formulation of discrete Shannon mutual information is

\[I^S(X; Y) = H^S(X) + H^S(Y) - H^S(X, Y),\]

where $H^S(\cdot)$ and $H^S(\cdot, \cdot)$ are the marginal and joint discrete Shannon entropies. This definition is used by mutualinfo when called with a ProbabilitiesEstimator.

Differential mutual information

One possible formulation of differential Shannon mutual information is

\[I^S(X; Y) = h^S(X) + h^S(Y) - h^S(X, Y),\]

where $h^S(\cdot)$ and $h^S(\cdot, \cdot)$ are the marginal and joint differential Shannon entropies. This definition is used by mutualinfo when called with a DifferentialEntropyEstimator.
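
A minimal estimation sketch, assuming the mutualinfo(measure, est, x, y) form and the ValueHistogram and Kraskov estimators re-exported from ComplexityMeasures.jl.

using CausalityTools

x = randn(1000)
y = x .+ randn(1000)
# Discrete estimate from a rectangular binning of the data.
mutualinfo(MIShannon(base = 2), ValueHistogram(RectangularBinning(4)), x, y)
# Differential estimate using the Kraskov nearest-neighbor entropy estimator.
mutualinfo(MIShannon(base = 2), Kraskov(k = 5), x, y)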

See also: mutualinfo.

source

Mutual information (Tsallis, Furuichi)

CausalityTools.MITsallisFuruichiType
MITsallisFuruichi <: MutualInformation
MITsallisFuruichi(; base = 2, q = 1.5)

The discrete Tsallis mutual information from Furuichi (2006), which in that paper is called the mutual entropy.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Furuichi's Tsallis mutual entropy between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_q^T(X; Y) = H_q^T(X) - H_q^T(X | Y) = H_q^T(X) + H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and q is the Tsallis parameter.

See also: mutualinfo.

source

Mutual information (Tsallis, Martin)

CausalityTools.MITsallisMartinType
MITsallisMartin <: MutualInformation
MITsallisMartin(; base = 2, q = 1.5)

The discrete Tsallis mutual information from Martin et al. (2004).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Martin et al.'s Tsallis mutual information between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_{\text{Martin}}^T(X, Y, q) := H_q^T(X) + H_q^T(Y) - (1 - q) H_q^T(X) H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and q is the Tsallis parameter.

See also: mutualinfo.

source

Mutual information (Rényi, Sarbu)

CausalityTools.MIRenyiSarbuType
MIRenyiSarbu <: MutualInformation
MIRenyiSarbu(; base = 2, q = 1.5)

The discrete Rényi mutual information from Sarbu (2014).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Sarbu (2014) defines discrete Rényi mutual information as the Rényi $\alpha$-divergence between the joint probability mass function $p(x, y)$ and the product of the marginals, $p(x) \cdot p(y)$:

\[I(X, Y)^R_q = \dfrac{1}{q-1} \log \left( \sum_{x \in X, y \in Y} \dfrac{p(x, y)^q}{\left( p(x)\cdot p(y) \right)^{q-1}} \right)\]

See also: mutualinfo.

source

Mutual information (Rényi, Jizba)

CausalityTools.MIRenyiJizbaType
MIRenyiJizba <: MutualInformation

The Rényi mutual information $I_q^{R_{J}}(X; Y)$ defined in (Jizba et al., 2012).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Definition

\[I_q^{R_{J}}(X; Y) = S_q^{R}(X) + S_q^{R}(Y) - S_q^{R}(X, Y),\]

where $S_q^{R}(\cdot)$ and $S_q^{R}(\cdot, \cdot)$ are the Rényi entropy and the joint Rényi entropy, respectively.

source

Conditional mutual information (Shannon)

CausalityTools.CMIShannonType
CMIShannon <: ConditionalMutualInformation
CMIShannon(; base = 2)

The Shannon conditional mutual information (CMI) $I^S(X; Y | Z)$.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Supported definitions

Consider random variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$, given $Z \in \mathbb{R}^{d_Z}$. The Shannon conditional mutual information is defined as

\[\begin{align*} I(X; Y | Z) &= H^S(X, Z) + H^S(Y, Z) - H^S(X, Y, Z) - H^S(Z) \\ &= I^S(X; Y, Z) - I^S(X; Z) \end{align*},\]

where $I^S(\cdot; \cdot)$ is the Shannon mutual information MIShannon, and $H^S(\cdot)$ is the Shannon entropy.

Differential Shannon CMI is obtained by replacing the entropies by differential entropies.
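
A minimal estimation sketch, assuming the condmutualinfo(measure, est, x, y, z) form with the dedicated FPVP conditional mutual information estimator.

using CausalityTools

x = randn(1000)
z = x .+ randn(1000)
y = z .+ randn(1000)   # x and y are associated only through z
condmutualinfo(CMIShannon(base = 2), FPVP(k = 5), x, y, z)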

See also: condmutualinfo.

source

Conditional mutual information (Rényi, Jizba)

CausalityTools.CMIRenyiJizbaType
CMIRenyiJizba <: ConditionalMutualInformation

The Rényi conditional mutual information $I_q^{R_{J}}(X; Y | Z)$ defined in Jizba et al. (2012).

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Definition

\[I_q^{R_{J}}(X; Y | Z) = I_q^{R_{J}}(X; Y, Z) - I_q^{R_{J}}(X; Z),\]

where $I_q^{R_{J}}(X; Z)$ is the MIRenyiJizba mutual information.

source

Conditional mutual information (Rényi, Poczos)

CausalityTools.CMIRenyiPoczosType
CMIRenyiPoczos <: ConditionalMutualInformation

The differential Rényi conditional mutual information $I_q^{R_{P}}(X; Y | Z)$ defined in (Póczos & Schneider, 2012)[Póczos2012].

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Definition

\[\begin{align*} I_q^{R_{P}}(X; Y | Z) &= \dfrac{1}{q-1} \log \int \int \int \dfrac{p_Z(z) p_{X, Y | Z}^q(x, y | z)}{\left( p_{X|Z}(x|z) p_{Y|Z}(y|z) \right)^{q-1}} \, dx \, dy \, dz \\ &= \dfrac{1}{q-1} \log \mathbb{E}_{(X, Y, Z) \sim p_{X, Y, Z}} \left[ \dfrac{p_{X, Z}^{1-q}(X, Z) p_{Y, Z}^{1-q}(Y, Z) }{p_{X, Y, Z}^{1-q}(X, Y, Z) p_Z^{1-q}(Z)} \right] \end{align*}\]

source

Transfer entropy (Shannon)

CausalityTools.TEShannonType
TEShannon <: TransferEntropy
TEShannon(; base = 2, embedding = EmbeddingTE())

The Shannon-type transfer entropy measure.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise and conditional dependence.
  • Use with transferentropy to compute the raw transfer entropy.

Description

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$, is defined as

\[\begin{align*} TE(S \to T) &:= I^S(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I^S(T^+; S^- | T^-, C^-) \end{align*}\]

where $I(T^+; S^- | T^-)$ is the Shannon conditional mutual information (CMIShannon). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for transferentropy.
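
A minimal estimation sketch, assuming the transferentropy(measure, est, s, t) form with the source series first and the target series second, and the ValueHistogram estimator listed in the table below.

using CausalityTools

# A toy example where x drives y with a one-step lag.
n = 2000
x = randn(n)
y = zeros(n)
for t in 2:n
    y[t] = 0.5 * y[t-1] + 0.4 * x[t-1] + 0.1 * randn()
end
# Binning-based estimate of TE(x → y).
transferentropy(TEShannon(base = 2), ValueHistogram(RectangularBinning(4)), x, y)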

Compatible estimators

Shannon-type transfer entropy can be estimated using a range of different estimators, which all boil down to computing conditional mutual information, except for TransferEntropyEstimators, which compute transfer entropy directly.

| Estimator | Type | Principle |
| --------- | ---- | --------- |
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| SymbolicPermutation | ProbabilitiesEstimator | Ordinal patterns |
| Dispersion | ProbabilitiesEstimator | Dispersion patterns |
| Kraskov | DifferentialEntropyEstimator | Nearest neighbors |
| Zhu | DifferentialEntropyEstimator | Nearest neighbors |
| ZhuSingh | DifferentialEntropyEstimator | Nearest neighbors |
| Gao | DifferentialEntropyEstimator | Nearest neighbors |
| Goria | DifferentialEntropyEstimator | Nearest neighbors |
| Lord | DifferentialEntropyEstimator | Nearest neighbors |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
| GaussianMI | MutualInformationEstimator | Parametric |
| KSG1 | MutualInformationEstimator | Continuous |
| KSG2 | MutualInformationEstimator | Continuous |
| GaoKannanOhViswanath | MutualInformationEstimator | Mixed |
| GaoOhViswanath | MutualInformationEstimator | Continuous |
| FPVP | ConditionalMutualInformationEstimator | Nearest neighbors |
| MesnerShalizi | ConditionalMutualInformationEstimator | Nearest neighbors |
| Rahimzamani | ConditionalMutualInformationEstimator | Nearest neighbors |
| Zhu1 | TransferEntropyEstimator | Nearest neighbors |
| Lindner | TransferEntropyEstimator | Nearest neighbors |
source

Transfer entropy (Rényi, Jizba)

CausalityTools.TERenyiJizbaType
TERenyiJizba() <: TransferEntropy

The Rényi transfer entropy from Jizba et al. (2012).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise and conditional dependence.
  • Use with transferentropy to compute the raw transfer entropy.

Description

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$, is defined as

\[\begin{align*} TE(S \to T) &:= I_q^{R_J}(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I_q^{R_J}(T^+; S^- | T^-, C^-), \end{align*},\]

where $I_q^{R_J}(T^+; S^- | T^-)$ is Jizba et al. (2012)'s definition of conditional mutual information (CMIRenyiJizba). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for transferentropy.

Compatible estimators

Jizba's formulation of Renyi-type transfer entropy can currently be estimated using selected probabilities estimators and differential entropy estimators, which under the hood compute the transfer entropy as Jizba's formulation of Rényi conditional mutual information.

| Estimator | Type | Principle |
| --------- | ---- | --------- |
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
source

Part mutual information

CausalityTools.PMIType
PMI <: AssociationMeasure
PMI(; base = 2)

The partial mutual information (PMI) measure of association (Zhao et al., 2016).

Definition

PMI is defined for variables $X$, $Y$ and $Z$ as

\[PMI(X; Y | Z) = D(p(x, y, z) || p^{*}(x|z) p^{*}(y|z) p(z)),\]

where $p(x, y, z)$ is the joint distribution for $X$, $Y$ and $Z$, and $D(\cdot, \cdot)$ is the extended Kullback-Leibler divergence from $p(x, y, z)$ to $p^{*}(x|z) p^{*}(y|z) p(z)$. See Zhao et al. (2016) for details.

Estimation

PMI can be estimated using any ProbabilitiesEstimator that implements marginal_encodings. This allows estimation of 3D contingency matrices, from which relevant probabilities for the PMI formula are extracted. See also pmi.
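
A minimal estimation sketch, assuming the pmi(measure, est, x, y, z) form with a binning-based probabilities estimator (which supports marginal_encodings).

using CausalityTools

x, y, z = rand(1000), rand(1000), rand(1000)
pmi(PMI(base = 2), ValueHistogram(RectangularBinning(3)), x, y, z)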

Properties

For the discrete case, the following identities hold in theory (when estimating PMI, they may not).

  • PMI(X, Y, Z) >= CMI(X, Y, Z) (where CMI is the Shannon CMI). Holds in theory, but when estimating PMI, the identity may not hold.
  • PMI(X, Y, Z) >= 0. Holds both in theory and for estimation using ProbabilitiesEstimators.
  • X ⫫ Y | Z => PMI(X, Y, Z) = CMI(X, Y, Z) = 0 (in theory, but not necessarily for estimation).
source

Predictive asymmetry

CausalityTools.PAType
PA <: CausalityTools.AssociationMeasure
PA(ηT = 1:5, τS = 1, τC = 1)

The modified predictive asymmetry measure (Haaga et al., in revision).

Note

This is an experimental measure. It is part of an ongoing paper submission revision, but is provided here for convenience.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise or conditional directional dependence.
  • Use with asymmetry to compute the raw asymmetry distribution.

Keyword arguments

  • ηT. The prediction lags for the target variable.
  • τS. The embedding delay(s) for the source variable.
  • τC. The embedding delay(s) for the conditional variable(s).

All parameters are given as a single integer or multiple integers greater than zero.

Compatible estimators

PA/asymmetry uses condmutualinfo under the hood. Any estimator that can be used for ConditionalMutualInformation can therefore, in principle, be used with the predictive asymmetry. We recommend using FPVP, or one of the other dedicated conditional mutual information estimators.

| Estimator | Type | Principle |
| --------- | ---- | --------- |
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| Dispersion | ProbabilitiesEstimator | Dispersion patterns |
| Kraskov | DifferentialEntropyEstimator | Nearest neighbors |
| Zhu | DifferentialEntropyEstimator | Nearest neighbors |
| ZhuSingh | DifferentialEntropyEstimator | Nearest neighbors |
| Gao | DifferentialEntropyEstimator | Nearest neighbors |
| Goria | DifferentialEntropyEstimator | Nearest neighbors |
| Lord | DifferentialEntropyEstimator | Nearest neighbors |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
| GaussianMI | MutualInformationEstimator | Parametric |
| KSG1 | MutualInformationEstimator | Continuous |
| KSG2 | MutualInformationEstimator | Continuous |
| GaoKannanOhViswanath | MutualInformationEstimator | Mixed |
| GaoOhViswanath | MutualInformationEstimator | Continuous |
| FPVP | ConditionalMutualInformationEstimator | Nearest neighbors |
| MesnerShalizi | ConditionalMutualInformationEstimator | Nearest neighbors |
| Rahimzamani | ConditionalMutualInformationEstimator | Nearest neighbors |

Examples
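
A minimal sketch, assuming the asymmetry(measure, est, x, y) form for the pairwise case with the recommended FPVP estimator (independent noise inputs are used purely for illustration).

using CausalityTools

x, y = randn(1000), randn(1000)
measure = PA(ηT = 1:5, τS = 1)
asymmetry(measure, FPVP(k = 3), x, y)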

source
  • Székely2007Székely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The annals of statistics, 35(6), 2769-2794.
  • Székely2014Székely, G. J., & Rizzo, M. L. (2014). Partial distance correlation with methods for dissimilarities.
  • Póczos2012Póczos, B., & Schneider, J. (2012, March). Nonparametric estimation of conditional information and divergences. In Artificial Intelligence and Statistics (pp. 914-923). PMLR.