Association measures

Overview

| Type | Measure | Function version |
| ---- | ------- | ---------------- |
| Correlation | PearsonCorrelation | pearson_correlation |
| Correlation | DistanceCorrelation | distance_correlation |
| Closeness | SMeasure | s_measure |
| Closeness | HMeasure | h_measure |
| Closeness | MMeasure | m_measure |
| Closeness (ranks) | LMeasure | l_measure |
| Closeness | JointDistanceDistribution | jdd |
| Cross-mapping | PairwiseAsymmetricInference | crossmap |
| Cross-mapping | ConvergentCrossMapping | crossmap |
| Conditional recurrence | MCR | mcr |
| Conditional recurrence | RMCD | rmcd |
| Shared information | MIShannon | mutualinfo |
| Shared information | MIRenyiJizba | mutualinfo |
| Shared information | MIRenyiSarbu | mutualinfo |
| Shared information | MITsallisFuruichi | mutualinfo |
| Shared information | PartialCorrelation | partial_correlation |
| Shared information | CMIShannon | condmutualinfo |
| Shared information | CMIRenyiSarbu | condmutualinfo |
| Shared information | CMIRenyiJizba | condmutualinfo |
| Information transfer | TEShannon | transferentropy |
| Information transfer | TERenyiJizba | transferentropy |
| Part mutual information | PMI | pmi |
| Information asymmetry | PA | asymmetry |

Correlation measures

Pearson correlation

CausalityTools.PearsonCorrelationType
PearsonCorrelation

The Pearson correlation of two variables.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with pearson_correlation to compute the raw correlation coefficient.

Description

The sample Pearson correlation coefficient for real-valued random variables $X$ and $Y$ with associated samples $\{x_i\}_{i=1}^N$ and $\{y_i\}_{i=1}^N$ is defined as

\[\rho_{xy} = \dfrac{\sum_{i=1}^N (x_i - \bar{x})(y_i - \bar{y}) }{\sqrt{\sum_{i=1}^N (x_i - \bar{x})^2}\sqrt{\sum_{i=1}^N (y_i - \bar{y})^2}},\]

where $\bar{x}$ and $\bar{y}$ are the sample means of the observations $x_i$ and $y_i$, respectively.
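
For concreteness, here is a direct computation of the coefficient above in plain Julia. This is a sketch of the formula only; Statistics.cor (and the package's pearson_correlation) computes the same quantity.

using Statistics

x = randn(1000)
y = 0.7 .* x .+ 0.3 .* randn(1000)
x̄, ȳ = mean(x), mean(y)
# Sample Pearson correlation coefficient, exactly as in the formula above.
ρ = sum((x .- x̄) .* (y .- ȳ)) /
    (sqrt(sum(abs2, x .- x̄)) * sqrt(sum(abs2, y .- ȳ)))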

source

Partial correlation

CausalityTools.PartialCorrelationType
PartialCorrelation <: AssociationMeasure

The correlation of two variables, with the effect of a set of conditioning variables removed.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with partial_correlation to compute the raw correlation coefficient.

Description

There are several ways of estimating the partial correlation. We follow the matrix inversion method, because for StateSpaceSets, we can very efficiently compute the required joint covariance matrix $\Sigma$ for the random variables.

Formally, let $X_1, X_2, \ldots, X_n$ be a set of $n$ real-valued random variables. Consider the joint precision matrix $P = (p_{ij}) = \Sigma^{-1}$. The partial correlation of any pair of variables $(X_i, X_j)$, given the remaining variables $\bf{Z} = \{X_k\}_{k=1, k \neq i, j}^n$, is defined as

\[\rho_{X_i X_j | \bf{Z}} = -\dfrac{p_{ij}}{\sqrt{ p_{ii} p_{jj} }}.\]

In practice, we compute the estimate

\[\hat{\rho}_{X_i X_j | \bf{Z}} = -\dfrac{\hat{p}_{ij}}{\sqrt{ \hat{p}_{ii} \hat{p}_{jj} }},\]

where $\hat{P} = \hat{\Sigma}^{-1}$ is the sample precision matrix.
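
For concreteness, a minimal sketch of the matrix-inversion route for three variables, using only Statistics and LinearAlgebra. This illustrates the estimator formula above, not the package internals; partial_correlation handles this for you.

using Statistics, LinearAlgebra

x1 = randn(1000)
x3 = randn(1000)
x2 = 0.5 .* x1 .+ 0.5 .* x3 .+ 0.1 .* randn(1000)

Σ = cov(hcat(x1, x2, x3))   # sample covariance matrix of the three variables
P = inv(Σ)                  # sample precision matrix P̂ = Σ̂⁻¹
# Partial correlation of X₁ and X₂ given X₃, per the formula above.
ρ12_3 = -P[1, 2] / sqrt(P[1, 1] * P[2, 2])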

source

Distance correlation

CausalityTools.DistanceCorrelationType
DistanceCorrelation

The distance correlation (Székely et al., 2007)[Székely2007] measure quantifies potentially nonlinear associations between pairs of variables. If applied to three variables, the partial distance correlation (Székely and Rizzo, 2014)[Székely2014] is computed.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with distance_correlation to compute the raw distance correlation coefficient.
Warn

A partial distance correlation distance_correlation(X, Y, Z) = 0 doesn't always guarantee conditional independence X ⫫ Y | Z. See Székely and Rizzo (2014) for in-depth discussion.
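
A minimal usage sketch, assuming the two- and three-argument distance_correlation forms implied by the usage notes above (random, independent inputs are used purely for illustration).

using CausalityTools

x, y, z = randn(500), randn(500), randn(500)
distance_correlation(x, y)      # pairwise distance correlation
distance_correlation(x, y, z)   # partial distance correlation, conditioning on z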

source

Closeness measures

Joint distance distribution

CausalityTools.JointDistanceDistributionType
JointDistanceDistribution <: AssociationMeasure
JointDistanceDistribution(; metric = Euclidean(), B = 10, D = 2, τ = -1, μ = 0.0)

The joint distance distribution (JDD) measure (Amigó and Hirata, 2018).

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with jdd to compute the raw joint distance distribution.

Keyword arguments

  • metric::Metric: An instance of a valid distance metric from Distances.jl. Defaults to Euclidean().
  • B::Int: The number of equidistant subintervals to divide the interval [0, 1] into when comparing the normalised distances.
  • D::Int: Embedding dimension.
  • τ::Int: Embedding delay. By convention, τ is negative.
  • μ: The hypothetical mean value of the joint distance distribution if there is no coupling between x and y (default is μ = 0.0).

Description

From input time series $x(t)$ and $y(t)$, we first construct the delay embeddings (note the positive sign in the embedding lags; therefore the input parameter τ is by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau}, \ldots, x_{i+(d_x - 1)\tau}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau}, \ldots, y_{i+(d_y - 1)\tau}) \} \\ \end{align*}\]

The algorithm then proceeds to analyze the distribution of distances between points of these embeddings, as described in Amigó and Hirata (2018).

Examples
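
A minimal sketch, assuming the jdd(measure, x, y) form listed in the overview table (random, independent inputs are used purely for illustration).

using CausalityTools

x, y = randn(1000), randn(1000)
measure = JointDistanceDistribution(D = 3, B = 5)
jdd(measure, x, y)   # returns the estimated joint distance distribution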

source

S-measure

CausalityTools.SMeasureType
SMeasure <: AssociationMeasure
SMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

SMeasure is a bivariate association measure from Arnhold et al. (1999) and Quiroga et al. (2000) that measures directional dependence between two input (potentially multivariate) time series.

Note that τx and τy are negative; see explanation below.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with s_measure to compute the raw s-measure statistic.

Description

The steps of the algorithm are:

  1. From input time series $x(t)$ and $y(t)$, construct the delay embeddings (note the positive sign in the embedding lags; therefore the input parameters τx and τy are by convention negative).

\[\begin{align*} \{\bf{x}_i \} &= \{(x_i, x_{i+\tau_x}, \ldots, x_{i+(d_x - 1)\tau_x}) \} \\ \{\bf{y}_i \} &= \{(y_i, y_{i+\tau_y}, \ldots, y_{i+(d_y - 1)\tau_y}) \} \\ \end{align*}\]

  2. Let $r_{i,j}$ and $s_{i,j}$ be the indices of the $K$ nearest neighbors of $\bf{x}_i$ and $\bf{y}_i$, respectively. Neighbors closer than w time indices are excluded during searches (i.e. w is the Theiler window).

  3. Compute the mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, using the indices $r_{i, j}$.

\[R_i^{(k)}(x) = \dfrac{1}{k} \sum_{j=1}^{k}\left(\bf{x}_i - \bf{x}_{r_{i,j}}\right)^2\]

  4. Compute the y-conditioned mean squared Euclidean distance to the $K$ nearest neighbors for each $x_i$, now using the indices $s_{i,j}$.

\[R_i^{(k)}(x|y) = \dfrac{1}{k} \sum_{j=1}^{k}\left(\bf{x}_i - \bf{x}_{s_{i,j}}\right)^2\]

  5. Define the following measure of independence, where $0 \leq S \leq 1$. Low values indicate independence, while values close to one occur for synchronized signals.

\[S^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \dfrac{R_i^{(k)}(x)}{R_i^{(k)}(x|y)}\]

Input data

The algorithm is slightly modified from Grassberger1999 to allow univariate timeseries as input.

  • If x and y are StateSpaceSets then use x and y as is and ignore the parameters dx/τx and dy/τy.
  • If x and y are scalar time series, then create dx and dy dimensional embeddings, respectively, of both x and y, resulting in N different m-dimensional embedding points $X = \{x_1, x_2, \ldots, x_N \}$ and $Y = \{y_1, y_2, \ldots, y_N \}$. τx and τy control the embedding lags for x and y.
  • If x is a scalar-valued vector and y is a StateSpaceSet, or vice versa, then create an embedding of the scalar timeseries using parameters dx/τx or dy/τy.

In all three cases, input StateSpaceSets are length-matched by eliminating points at the end of the longest StateSpaceSet (after the embedding step, if relevant) before analysis.

source
CausalityTools.s_measureFunction
s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet)

Compute the SMeasure from source x to target y.

source
s_measure(measure::SMeasure, x::VectorOrStateSpaceSet, y::VectorOrStateSpaceSet) → s ∈ [0, 1]

Compute the given measure to quantify the directional dependence between univariate/multivariate time series x and y.

Returns a scalar s where s = 0 indicates independence between x and y, and higher values indicate synchronization between x and y, with complete synchronization for s = 1.0.

Example

using CausalityTools

# A two-dimensional Ulam lattice map
sys = ulam(2)

# Sample 1000 points after discarding 5000 transients
orbit = trajectory(sys, 1000, Ttr = 5000)
x, y = orbit[:, 1], orbit[:, 2]

# 4-dimensional embedding for `x`, 5-dimensional embedding for `y`
m = SMeasure(dx = 4, τx = 3, dy = 5, τy = 1)
s_measure(m, x, y)
source

H-measure

CausalityTools.HMeasureType
HMeasure <: AssociationMeasure
HMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The HMeasure (Arnhold et al., 1999) is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with h_measure to compute the raw h-measure statistic.

Description

The HMeasure (Arnhold et al., 1999) is similar to the SMeasure, but the numerator of the formula is replaced by $R_i(x)$, the mean squared Euclidean distance to all other points, and there is a $\log$-term inside the sum:

\[H^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x)}{R_i^{(k)}(x|y)} \right).\]

Parameters are the same and $R_i^{(k)}(x|y)$ is computed as for SMeasure.
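
A minimal sketch, assuming h_measure follows the same calling convention as s_measure above.

using CausalityTools

x, y = randn(800), randn(800)
m = HMeasure(K = 3, dx = 3, τx = -1, dy = 3, τy = -1)
h_measure(m, x, y)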

source

M-measure

CausalityTools.MMeasureType
MMeasure <: AssociationMeasure
MMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The MMeasure (Andrzejak et al., 2003) is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with m_measure to compute the raw m-measure statistic.

Description

The MMeasure is based on SMeasure and HMeasure. It is given by

\[M^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{R_i(x) - R_i^{(k)}(x|y)}{R_i(x) - R_i^k(x)} \right),\]

where $R_i(x)$ is computed as for HMeasure, while $R_i^k(x)$ and $R_i^{(k)}(x|y)$ are computed as for SMeasure. Parameters also have the same meaning as for SMeasure/HMeasure.

source

L-measure

CausalityTools.LMeasureType
LMeasure <: AssociationMeasure
LMeasure(; K::Int = 2, dx = 2, dy = 2, τx = - 1, τy = -1, w = 0)

The LMeasure (Chicharro and Andrzejak, 2009) is a pairwise association measure. It quantifies the probability with which close states of a target timeseries/embedding are mapped to close states of a source timeseries/embedding.

Note that τx and τy are negative by convention. See docstring for SMeasure for an explanation.

Usage

  • Use with independence to perform a formal hypothesis test for directional dependence.
  • Use with l_measure to compute the raw l-measure statistic.

Description

LMeasure is similar to MMeasure, but uses distance ranks instead of the raw distances.

Let $\bf{x_i}$ be an embedding vector, and let $g_{i,j}$ denote the rank of the distance between $\bf{x_i}$ and some other vector $\bf{x_j}$ in an ascending sorted list of distances between $\bf{x_i}$ and all other vectors $\bf{x_{j \neq i}}$. In other words, the $g_{i,j}$ are just the ranks of the $N-1$ nearest-neighbor distances of $\bf{x_i}$, sorted in ascending order.

LMeasure is then defined as

\[L^{(k)}(x|y) = \dfrac{1}{N} \sum_{i=1}^{N} \log \left( \dfrac{G_i(x) - G_i^{(k)}(x|y)}{G_i(x) - G_i^{(k)}(x)} \right),\]

where $G_i(x) = \frac{N}{2}$ and $G_i^{(k)}(x) = \frac{k+1}{2}$ are the mean rank and the minimal mean rank, respectively.

The $y$-conditioned mean rank is defined as

\[G_i^{(k)}(x|y) = \dfrac{1}{K}\sum_{j=1}^{K} g_{i,w_{i, j}},\]

where $w_{i,j}$ is the index of the $j$-th nearest neighbor of $\bf{y_i}$.

source

Cross-map measures

See also the cross mapping API for estimators.

Convergent cross mapping

CausalityTools.ConvergentCrossMappingType
ConvergentCrossMapping <: CrossmapMeasure
ConvergentCrossMapping(; d::Int = 2, τ::Int = -1, w::Int = 0,
    f = Statistics.cor, embed_warn = true)

The convergent cross mapping (CCM) measure (Sugihara et al., 2012).

Specifies the embedding dimension d and embedding lag τ to be used, as described below, with predict or crossmap. The Theiler window w controls how many temporal neighbors are excluded during neighbor searches (w = 0 means that only the point itself is excluded). f is a function that computes the agreement between observations and predictions (the default, f = Statistics.cor, gives the Pearson correlation coefficient).

Embedding

Let S(i) be the source time series variable and T(i) be the target time series variable. This version produces regular embeddings with fixed dimension d and embedding lag τ as follows:

\[(S(i), S(i+\tau), S(i+2\tau), \ldots, S(i+(d-1)\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first D-1 variables, while the last (D-th) variable is to be predicted.

With this convention, τ < 0 implies "past/present values of source used to predict target", and τ > 0 implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if τ > 0 (embed_warn = false turns off warnings).
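
A minimal sketch, assuming the crossmap(measure, x, y) form listed in the overview table. Independent noise series are used here purely for illustration, so the cross-map skill should be low.

using CausalityTools

x, y = randn(500), randn(500)
measure = ConvergentCrossMapping(d = 3, τ = -1)
# Cross-map skill: agreement between observed and predicted values
# (Pearson correlation with the default f = Statistics.cor).
crossmap(measure, x, y)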

source

Pairwise asymmetric inference

CausalityTools.PairwiseAsymmetricInferenceType
PairwiseAsymmetricInference <: CrossmapMeasure
PairwiseAsymmetricInference(; d::Int = 2, τ::Int = -1, w::Int = 0,
    f = Statistics.cor, embed_warn = true)

The pairwise asymmetric inference (PAI) cross mapping measure (McCracken and Weigel, 2014) is a version of ConvergentCrossMapping that searches for neighbors in mixed embeddings (i.e. both source and target variables included); otherwise, the algorithms are identical.

Specifies the embedding dimension d and embedding lag τ to be used, as described below, with predict or crossmap. The Theiler window w controls how many temporal neighbors are excluded during neighbor searches (w = 0 means that only the point itself is excluded). f is a function that computes the agreement between observations and predictions (the default, f = Statistics.cor, gives the Pearson correlation coefficient).

Embedding

There are many possible ways of defining the embedding for PAI. Currently, we only implement the "add one non-lagged source timeseries to an embedding of the target" approach, which is used as an example in McCracken & Weigel's paper. Specifically: Let S(i) be the source time series variable and T(i) be the target time series variable. PairwiseAsymmetricInference produces regular embeddings with fixed dimension d and embedding lag τ as follows:

\[(S(i), T(i+(d-1)\tau), \ldots, T(i+2\tau), T(i+\tau), T(i))_{i=1}^{N-(d-1)\tau}.\]

In this joint embedding, neighbor searches are performed in the subspace spanned by the first D variables, while the last variable is to be predicted.

With this convention, τ < 0 implies "past/present values of source used to predict target", and τ > 0 implies "future/present values of source used to predict target". The latter case may not be meaningful for many applications, so by default, a warning will be given if τ > 0 (embed_warn = false turns off warnings).

source

Recurrence-based

CausalityTools.MCRType
MCR <: AssociationMeasure
MCR(; r, metric = Euclidean())

An association measure based on mean conditional probabilities of recurrence (MCR) introduced by Romano et al. (2007).

r is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be an instance of any subtype of AbstractRecurrenceType from RecurrenceAnalysis.jl. To use any r that is not a real number, you have to do using RecurrenceAnalysis first. The metric is any valid metric from Distances.jl.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise association.
  • Use with mcr to compute the raw MCR for pairwise association.

Description

For input variables X and Y, the conditional probability of recurrence is defined as

\[M(X | Y) = \dfrac{1}{N} \sum_{i=1}^N p(\bf{y_i} | \bf{x_i}) = \dfrac{1}{N} \sum_{i=1}^N \dfrac{\sum_{j=1}^N JR_{i, j}^{X, Y}}{\sum_{j=1}^N R_{i, j}^X},\]

where $R_{i, j}^X$ is the recurrence matrix and $JR_{i, j}^{X, Y}$ is the joint recurrence matrix, constructed using the given metric. The measure $M(Y | X)$ is defined analogously.

Romano et al. (2007)'s interpretation of this quantity is that if X drives Y, then M(X|Y) > M(Y|X), if Y drives X, then M(Y|X) > M(X|Y), and if coupling is symmetric, then M(Y|X) = M(X|Y).

Input data

X and Y can be either both univariate timeseries, or both multivariate StateSpaceSets.
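
A minimal sketch, assuming the mcr(measure, x, y) form listed in the overview table and a fixed real-valued recurrence threshold.

using CausalityTools

x, y = randn(500), randn(500)
measure = MCR(r = 0.5)   # fixed recurrence threshold
mcr(measure, x, y)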

source
CausalityTools.RMCDType
RMCD <: AssociationMeasure
RMCD(; r, metric = Euclidean(), base = 2)

The recurrence measure of conditional dependence, or RMCD (Ramos et al., 2017), is a recurrence-based measure that mimics the conditional mutual information, but uses recurrence probabilities.

r is a mandatory keyword which specifies the recurrence threshold when constructing recurrence matrices. It can be an instance of any subtype of AbstractRecurrenceType from RecurrenceAnalysis.jl. To use any r that is not a real number, you have to do using RecurrenceAnalysis first. The metric is any valid metric from Distances.jl.

Both the pairwise and conditional RMCD is non-negative, but due to round-off error, negative values may occur. If that happens, an RMCD value of 0.0 is returned.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise or conditional association.
  • Use with rmcd to compute the raw RMCD for pairwise or conditional association.

Description

The RMCD measure is defined by

\[I_{RMCD}(X; Y | Z) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y, Z} \log \left( \dfrac{\sum_{j} R_{ij}^{X, Y, Z} \sum_{j} R_{ij}^{Z} }{\sum_{j} R_{ij}^{X, Z} \sum_{j} R_{ij}^{Y, Z}} \right) \right],\]

where base controls the base of the logarithm. $I_{RMCD}(X; Y | Z)$ is zero when $Z = X$, $Z = Y$ or when $X$, $Y$ and $Z$ are mutually independent.

Our implementation allows dropping the third/last argument, in which case the following mutual information-like quantity is computed (not discussed in Ramos et al. (2017)).

\[ I_{RMCD}(X; Y) = \dfrac{1}{N} \sum_{i} \left[ \dfrac{1}{N} \sum_{j} R_{ij}^{X, Y} \log \left( \dfrac{\sum_{j} R_{ij}^{X} R_{ij}^{Y} }{\sum_{j} R_{ij}^{X, Y}} \right) \right]\]

source

Information measures

Association measures that are information-based are listed here. Available estimators are listed in the information API.

Mutual information (Shannon)

CausalityTools.MIShannonType
MIShannon <: MutualInformation
MIShannon(; base = 2)

The Shannon mutual information $I^S(X; Y)$.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Discrete definition

There are many equivalent formulations of discrete Shannon mutual information. In this package, we currently use the double-sum and the three-entropies formulations.

Double sum formulation

Assume we observe samples $\bar{\bf{X}}_{1:N_x} = \{\bar{\bf{X}}_1, \ldots, \bar{\bf{X}}_{N_x} \}$ and $\bar{\bf{Y}}_{1:N_y} = \{\bar{\bf{Y}}_1, \ldots, \bar{\bf{Y}}_{N_y} \}$ from two discrete random variables $X$ and $Y$ with finite supports $\mathcal{X} = \{ x_1, x_2, \ldots, x_{M_x} \}$ and $\mathcal{Y} = \{ y_1, y_2, \ldots, y_{M_y} \}$. The double-sum estimate is obtained by replacing the probabilities in the double sum by their sample-frequency estimates,

\[\hat{I}_{DS}(X; Y) = \sum_{x_i \in \mathcal{X}, y_j \in \mathcal{Y}} \hat{p}(x_i, y_j) \log \left( \dfrac{\hat{p}(x_i, y_j)}{\hat{p}(x_i)\hat{p}(y_j)} \right),\]

where $\hat{p}(x_i) = \frac{n(x_i)}{N_x}$, $\hat{p}(y_j) = \frac{n(y_j)}{N_y}$, and $\hat{p}(x_i, y_j) = \frac{n(x_i, y_j)}{N}$, with $N = N_x N_y$. This definition is used by mutualinfo when called with a ContingencyMatrix.

Three-entropies formulation

An equivalent formulation of discrete Shannon mutual information is

\[I^S(X; Y) = H^S(X) + H^S(Y) - H^S(X, Y),\]

where $H^S(\cdot)$ and $H^S(\cdot, \cdot)$ are the marginal and joint discrete Shannon entropies. This definition is used by mutualinfo when called with a ProbabilitiesEstimator.

Differential mutual information

One possible formulation of differential Shannon mutual information is

\[I^S(X; Y) = h^S(X) + h^S(Y) - h^S(X, Y),\]

where $h^S(\cdot)$ and $h^S(\cdot, \cdot)$ are the marginal and joint differential Shannon entropies. This definition is used by mutualinfo when called with a DifferentialEntropyEstimator.
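
A minimal estimation sketch, assuming the mutualinfo(measure, est, x, y) form and the ValueHistogram and Kraskov estimators re-exported from ComplexityMeasures.jl.

using CausalityTools

x = randn(1000)
y = x .+ randn(1000)
# Discrete estimate from a rectangular binning of the data.
mutualinfo(MIShannon(base = 2), ValueHistogram(RectangularBinning(4)), x, y)
# Differential estimate using the Kraskov nearest-neighbor entropy estimator.
mutualinfo(MIShannon(base = 2), Kraskov(k = 5), x, y)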

See also: mutualinfo.

source

Mutual information (Tsallis, Furuichi)

CausalityTools.MITsallisFuruichiType
MITsallisFuruichi <: MutualInformation
MITsallisFuruichi(; base = 2, q = 1.5)

The discrete Tsallis mutual information from Furuichi (2006), which in that paper is called the mutual entropy.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Furuichi's Tsallis mutual entropy between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_q^T(X; Y) = H_q^T(X) - H_q^T(X | Y) = H_q^T(X) + H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and q is the Tsallis parameter.

See also: mutualinfo.

source

Mutual information (Tsallis, Martin)

CausalityTools.MITsallisMartinType
MITsallisMartin <: MutualInformation
MITsallisMartin(; base = 2, q = 1.5)

The discrete Tsallis mutual information from Martin et al. (2004).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Martin et al.'s Tsallis mutual information between variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$ is defined as

\[I_{\text{Martin}}^T(X, Y, q) := H_q^T(X) + H_q^T(Y) - (1 - q) H_q^T(X) H_q^T(Y) - H_q^T(X, Y),\]

where $H_q^T(\cdot)$ and $H_q^T(\cdot, \cdot)$ are the marginal and joint Tsallis entropies, and q is the Tsallis parameter.

See also: mutualinfo.

source

Mutual information (Rényi, Sarbu)

CausalityTools.MIRenyiSarbuType
MIRenyiSarbu <: MutualInformation
MIRenyiSarbu(; base = 2, q = 1.5)

The discrete Rényi mutual information from Sarbu (2014).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Description

Sarbu (2014) defines discrete Rényi mutual information as the Rényi $\alpha$-divergence between the joint probability mass function $p(x, y)$ and the product of the marginals, $p(x) \cdot p(y)$:

\[I(X, Y)^R_q = \dfrac{1}{q-1} \log \left( \sum_{x \in X, y \in Y} \dfrac{p(x, y)^q}{\left( p(x)\cdot p(y) \right)^{q-1}} \right)\]

See also: mutualinfo.

source

Mutual information (Rényi, Jizba)

CausalityTools.MIRenyiJizbaType
MIRenyiJizba <: MutualInformation

The Rényi mutual information $I_q^{R_{J}}(X; Y)$ defined in (Jizba et al., 2012).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise dependence.
  • Use with mutualinfo to compute the raw mutual information.

Definition

\[I_q^{R_{J}}(X; Y) = S_q^{R}(X) + S_q^{R}(Y) - S_q^{R}(X, Y),\]

where $S_q^{R}(\cdot)$ and $S_q^{R}(\cdot, \cdot)$ are the Rényi entropy and the joint Rényi entropy, respectively.

source

Conditional mutual information (Shannon)

CausalityTools.CMIShannonType
CMIShannon <: ConditionalMutualInformation
CMIShannon(; base = 2)

The Shannon conditional mutual information (CMI) $I^S(X; Y | Z)$.

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Supported definitions

Consider random variables $X \in \mathbb{R}^{d_X}$ and $Y \in \mathbb{R}^{d_Y}$, given $Z \in \mathbb{R}^{d_Z}$. The Shannon conditional mutual information is defined as

\[\begin{align*} I(X; Y | Z) &= H^S(X, Z) + H^S(Y, Z) - H^S(X, Y, Z) - H^S(Z) \\ &= I^S(X; Y, Z) - I^S(X; Z) \end{align*},\]

where $I^S(\cdot; \cdot)$ is the Shannon mutual information MIShannon, and $H^S(\cdot)$ is the Shannon entropy.

Differential Shannon CMI is obtained by replacing the entropies by differential entropies.
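
A minimal estimation sketch, assuming the condmutualinfo(measure, est, x, y, z) form with the dedicated FPVP conditional mutual information estimator.

using CausalityTools

x = randn(1000)
z = x .+ randn(1000)
y = z .+ randn(1000)   # x and y are associated only through z
condmutualinfo(CMIShannon(base = 2), FPVP(k = 5), x, y, z)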

See also: condmutualinfo.

source

Conditional mutual information (Rényi, Jizba)

CausalityTools.CMIRenyiJizbaType
CMIRenyiJizba <: ConditionalMutualInformation

The Rényi conditional mutual information $I_q^{R_{J}}(X; Y | Z)$ defined in Jizba et al. (2012).

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Definition

\[I_q^{R_{J}}(X; Y | Z) = I_q^{R_{J}}(X; Y, Z) - I_q^{R_{J}}(X; Z),\]

where $I_q^{R_{J}}(X; Z)$ is the MIRenyiJizba mutual information.

source

Conditional mutual information (Rényi, Poczos)

CausalityTools.CMIRenyiPoczosType
CMIRenyiPoczos <: ConditionalMutualInformation

The differential Rényi conditional mutual information $I_q^{R_{P}}(X; Y | Z)$ defined in (Póczos & Schneider, 2012)[Póczos2012].

Usage

  • Use with independence to perform a formal hypothesis test for conditional dependence.
  • Use with condmutualinfo to compute the raw conditional mutual information.

Definition

\[\begin{align*} I_q^{R_{P}}(X; Y | Z) &= \dfrac{1}{q-1} \log \int \int \int \dfrac{p_Z(z) p_{X, Y | Z}^q(x, y | z)}{\left( p_{X|Z}(x|z) p_{Y|Z}(y|z) \right)^{q-1}} \, dx \, dy \, dz \\ &= \dfrac{1}{q-1} \log \mathbb{E}_{(X, Y, Z) \sim p_{X, Y, Z}} \left[ \dfrac{p_{X, Z}^{1-q}(X, Z) p_{Y, Z}^{1-q}(Y, Z) }{p_{X, Y, Z}^{1-q}(X, Y, Z) p_Z^{1-q}(Z)} \right] \end{align*}\]

source

Transfer entropy (Shannon)

CausalityTools.TEShannonType
TEShannon <: TransferEntropy
TEShannon(; base = 2, embedding = EmbeddingTE())

The Shannon-type transfer entropy measure.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise and conditional dependence.
  • Use with transferentropy to compute the raw transfer entropy.

Description

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$, is defined as

\[\begin{align*} TE(S \to T) &:= I^S(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I^S(T^+; S^- | T^-, C^-) \end{align*}\]

where $I(T^+; S^- | T^-)$ is the Shannon conditional mutual information (CMIShannon). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for transferentropy.
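
A minimal estimation sketch, assuming the transferentropy(measure, est, s, t) form with the source series first and the target series second, and the ValueHistogram estimator listed in the table below.

using CausalityTools

# A toy example where x drives y with a one-step lag.
n = 2000
x = randn(n)
y = zeros(n)
for t in 2:n
    y[t] = 0.5 * y[t-1] + 0.4 * x[t-1] + 0.1 * randn()
end
# Binning-based estimate of TE(x → y).
transferentropy(TEShannon(base = 2), ValueHistogram(RectangularBinning(4)), x, y)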

Compatible estimators

Shannon-type transfer entropy can be estimated using a range of different estimators, which all boil down to computing conditional mutual information, except for TransferEntropyEstimators, which compute transfer entropy directly.

| Estimator | Type | Principle |
| --------- | ---- | --------- |
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| SymbolicPermutation | ProbabilitiesEstimator | Ordinal patterns |
| Dispersion | ProbabilitiesEstimator | Dispersion patterns |
| Kraskov | DifferentialEntropyEstimator | Nearest neighbors |
| Zhu | DifferentialEntropyEstimator | Nearest neighbors |
| ZhuSingh | DifferentialEntropyEstimator | Nearest neighbors |
| Gao | DifferentialEntropyEstimator | Nearest neighbors |
| Goria | DifferentialEntropyEstimator | Nearest neighbors |
| Lord | DifferentialEntropyEstimator | Nearest neighbors |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
| GaussianMI | MutualInformationEstimator | Parametric |
| KSG1 | MutualInformationEstimator | Continuous |
| KSG2 | MutualInformationEstimator | Continuous |
| GaoKannanOhViswanath | MutualInformationEstimator | Mixed |
| GaoOhViswanath | MutualInformationEstimator | Continuous |
| FPVP | ConditionalMutualInformationEstimator | Nearest neighbors |
| MesnerShalizi | ConditionalMutualInformationEstimator | Nearest neighbors |
| Rahimzamani | ConditionalMutualInformationEstimator | Nearest neighbors |
| Zhu1 | TransferEntropyEstimator | Nearest neighbors |
| Lindner | TransferEntropyEstimator | Nearest neighbors |
source

Transfer entropy (Rényi, Jizba)

CausalityTools.TERenyiJizbaType
TERenyiJizba() <: TransferEntropy

The Rényi transfer entropy from Jizba et al. (2012).

Usage

  • Use with independence to perform a formal hypothesis test for pairwise and conditional dependence.
  • Use with transferentropy to compute the raw transfer entropy.

Description

The transfer entropy from source $S$ to target $T$, potentially conditioned on $C$, is defined as

\[\begin{align*} TE(S \to T) &:= I_q^{R_J}(T^+; S^- | T^-) \\ TE(S \to T | C) &:= I_q^{R_J}(T^+; S^- | T^-, C^-), \end{align*},\]

where $I_q^{R_J}(T^+; S^- | T^-)$ is Jizba et al. (2012)'s definition of conditional mutual information (CMIRenyiJizba). The variables $T^+$, $T^-$, $S^-$ and $C^-$ are described in the docstring for transferentropy.

Compatible estimators

Jizba's formulation of Renyi-type transfer entropy can currently be estimated using selected probabilities estimators and differential entropy estimators, which under the hood compute the transfer entropy as Jizba's formulation of Rényi conditional mutual information.

| Estimator | Type | Principle |
| --------- | ---- | --------- |
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
source

Part mutual information

CausalityTools.PMIType
PMI <: AssociationMeasure
PMI(; base = 2)

The partial mutual information (PMI) measure of association (Zhao et al., 2016).

Definition

PMI is defined for variables $X$, $Y$ and $Z$ as

\[PMI(X; Y | Z) = D(p(x, y, z) || p^{*}(x|z) p^{*}(y|z) p(z)),\]

where $p(x, y, z)$ is the joint distribution for $X$, $Y$ and $Z$, and $D(\cdot, \cdot)$ is the extended Kullback-Leibler divergence from $p(x, y, z)$ to $p^{*}(x|z) p^{*}(y|z) p(z)$. See Zhao et al. (2016) for details.

Estimation

PMI can be estimated using any ProbabilitiesEstimator that implements marginal_encodings. This allows estimation of 3D contingency matrices, from which relevant probabilities for the PMI formula are extracted. See also pmi.
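
A minimal estimation sketch, assuming the pmi(measure, est, x, y, z) form with a binning-based probabilities estimator (which supports marginal_encodings).

using CausalityTools

x, y, z = rand(1000), rand(1000), rand(1000)
pmi(PMI(base = 2), ValueHistogram(RectangularBinning(3)), x, y, z)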

Properties

For the discrete case, the following identities hold in theory (when estimating PMI, they may not).

  • PMI(X, Y, Z) >= CMI(X, Y, Z) (where CMI is the Shannon CMI). Holds in theory, but when estimating PMI, the identity may not hold.
  • PMI(X, Y, Z) >= 0. Holds both in theory and for estimation using ProbabilitiesEstimators.
  • X ⫫ Y | Z => PMI(X, Y, Z) = CMI(X, Y, Z) = 0 (in theory, but not necessarily for estimation).
source

Predictive asymmetry

CausalityTools.PAType
PA <: CausalityTools.AssociationMeasure
PA(ηT = 1:5, τS = 1, τC = 1)

The modified predictive asymmetry measure (Haaga et al., in revision).

Note

This is an experimental measure. It is part of an ongoing paper submission revision, but is provided here for convenience.

Usage

  • Use with independence to perform a formal hypothesis test for pairwise or conditional directional dependence.
  • Use with asymmetry to compute the raw asymmetry distribution.

Keyword arguments

  • ηT. The prediction lags for the target variable.
  • τS. The embedding delay(s) for the source variable.
  • τC. The embedding delay(s) for the conditional variable(s).

All parameters are given as a single integer or multiple integers greater than zero.

Compatible estimators

PA/asymmetry uses condmutualinfo under the hood. Any estimator that can be used for ConditionalMutualInformation can therefore, in principle, be used with the predictive asymmetry. We recommend using FPVP, or one of the other dedicated conditional mutual information estimators.

| Estimator | Type | Principle |
| --------- | ---- | --------- |
| CountOccurrences | ProbabilitiesEstimator | Frequencies |
| ValueHistogram | ProbabilitiesEstimator | Binning (histogram) |
| Dispersion | ProbabilitiesEstimator | Dispersion patterns |
| Kraskov | DifferentialEntropyEstimator | Nearest neighbors |
| Zhu | DifferentialEntropyEstimator | Nearest neighbors |
| ZhuSingh | DifferentialEntropyEstimator | Nearest neighbors |
| Gao | DifferentialEntropyEstimator | Nearest neighbors |
| Goria | DifferentialEntropyEstimator | Nearest neighbors |
| Lord | DifferentialEntropyEstimator | Nearest neighbors |
| LeonenkoProzantoSavani | DifferentialEntropyEstimator | Nearest neighbors |
| GaussianMI | MutualInformationEstimator | Parametric |
| KSG1 | MutualInformationEstimator | Continuous |
| KSG2 | MutualInformationEstimator | Continuous |
| GaoKannanOhViswanath | MutualInformationEstimator | Mixed |
| GaoOhViswanath | MutualInformationEstimator | Continuous |
| FPVP | ConditionalMutualInformationEstimator | Nearest neighbors |
| MesnerShalizi | ConditionalMutualInformationEstimator | Nearest neighbors |
| Rahimzamani | ConditionalMutualInformationEstimator | Nearest neighbors |

Examples
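
A minimal sketch, assuming the asymmetry(measure, est, x, y) form for the pairwise case with the recommended FPVP estimator (independent noise inputs are used purely for illustration).

using CausalityTools

x, y = randn(1000), randn(1000)
measure = PA(ηT = 1:5, τS = 1)
asymmetry(measure, FPVP(k = 3), x, y)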

source
  • Székely2007Székely, G. J., Rizzo, M. L., & Bakirov, N. K. (2007). Measuring and testing dependence by correlation of distances. The annals of statistics, 35(6), 2769-2794.
  • Székely2014Székely, G. J., & Rizzo, M. L. (2014). Partial distance correlation with methods for dissimilarities.
  • Póczos2012Póczos, B., & Schneider, J. (2012, March). Nonparametric estimation of conditional information and divergences. In Artificial Intelligence and Statistics (pp. 914-923). PMLR.