Predictive asymmetry
Computing the asymmetry distribution
The following example demonstrates how to compute the asymmetry
distribution from time series input. We'll use time series from a chain of unidirectionally coupled logistic maps, coupled as $X \to Y \to Z \to W$.
These examples compute the asymmetry distribution directly. Use the PA measure with PATest for formal independence testing.
Pairwise analysis
When considering only two variables $V_1$ and $V_2$, we expect the distribution $\Delta A_{V_1 \to V_2}$ to be skewed towards positive values if $V_1 \to V_2$.
Parameters are tuned by providing an instance of the PA measure, which quantifies directional influence. We'll use the FPVP estimator, and compute the asymmetry distribution over prediction lags ηT = 1:10. In real applications, it is important to ensure proper embeddings for the source (and conditional, if relevant) variables. We will optimize embedding parameters using the "traditional" approach from DelayEmbeddings.jl.
using CausalityTools
using DelayEmbeddings
using Random
rng = MersenneTwister(1234)
sys = system(Logistic4Chain(xi = [0.1, 0.2, 0.3, 0.4]; rng))
x, y, z, w = columns(first(trajectory(sys, 1000)))
τx = estimate_delay(x, "mi_min")
τy = estimate_delay(y, "mi_min")
est = FPVP(; k = 3, w = 5)
ΔA_xy = asymmetry(PA(ηT = 1:10, τS = τx), est, x, y)
ΔA_yx = asymmetry(PA(ηT = 1:10, τS = τy), est, y, x)
ΔA_xy, ΔA_yx
([0.37964078637756926, 0.5700010587023704, 0.4423661961050572, 0.3251390525739378, 0.22043583037467318, 0.11046402456257488, 0.035387342936977206, 0.08374517352065561, 0.045674645931783886, 0.003536500330831499], [-0.22264646714021946, -0.3785222647748956, -0.5333594832344022, -0.5718418632072824, -0.5081584252884588, -0.34472974311472615, -0.23009305716682052, -0.15795192261381658, -0.1023788425564762, -0.042614661166228])
As expected, since there is coupling $X \to Y$, $\Delta A_{X \to Y}$ is skewed towards positive values, while $\Delta A_{Y \to X}$ is skewed towards negative values because there is no coupling $Y \to X$.
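To make "skewed towards positive values" concrete, the sketch below summarizes the two distributions printed above. It uses only the `Statistics` standard library. The helper `skew_summary` is hypothetical and not part of CausalityTools, and these summary statistics are only a rough descriptive check, not a formal test.

```julia
using Statistics

# Values copied from the output above (prediction lags ηT = 1:10).
ΔA_xy = [0.37964078637756926, 0.5700010587023704, 0.4423661961050572,
         0.3251390525739378, 0.22043583037467318, 0.11046402456257488,
         0.035387342936977206, 0.08374517352065561, 0.045674645931783886,
         0.003536500330831499]
ΔA_yx = [-0.22264646714021946, -0.3785222647748956, -0.5333594832344022,
         -0.5718418632072824, -0.5081584252884588, -0.34472974311472615,
         -0.23009305716682052, -0.15795192261381658, -0.1023788425564762,
         -0.042614661166228]

# Hypothetical helper (not a CausalityTools function): mean, median and
# the fraction of strictly positive values of an asymmetry distribution.
skew_summary(ΔA) = (mean = mean(ΔA),
                    median = median(ΔA),
                    frac_positive = count(>(0), ΔA) / length(ΔA))

s_xy = skew_summary(ΔA_xy)
s_yx = skew_summary(ΔA_yx)
println(s_xy)
println(s_yx)
```

For these values, every entry of `ΔA_xy` is positive and every entry of `ΔA_yx` is negative, which matches the known coupling direction $X \to Y$.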
Conditional analysis
What happens if we compute $\Delta A_{X \to Z}$? We might expect some information transfer $X \to Z$, even though they are not directly linked, because information is transferred through $Y$.
ΔA_xz = asymmetry(PA(ηT = 1:10, τS = estimate_delay(x, "mi_min")), est, x, z)
10-element Vector{Float64}:
0.21424716981602568
0.27851545180713033
0.4015997961187442
0.4675500347030335
0.3920970839275831
0.28389260252099824
0.21340532151364086
0.11700067174106453
0.03304355139404827
-0.005941821256717633
As expected, the distribution is still skewed towards positive values. To determine whether the information flow between $X$ and $Z$ is mediated by $Y$, we can compute the conditional distribution $\Delta A_{X \to Z | Y}$. If these values are still positively skewed, we conclude that $Y$ is not a mediating variable. If conditioning on $Y$ causes $\Delta A_{X \to Z | Y}$ to no longer be skewed towards positive values, then we conclude that $Y$ is a mediating variable and that $X$ and $Z$ are linked $X \to Y \to Z$.
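A minimal sketch of the conditional computation might look as follows. It repeats the setup from the pairwise example so that it can run on its own. Two things here are assumptions rather than confirmed by the text above: that `asymmetry` accepts a third time series as the conditioning variable, and that `PA` takes a `τC` keyword for the embedding delay of that conditional variable. Check the `PA` and `asymmetry` docstrings for your installed version before relying on this.

```julia
using CausalityTools
using DelayEmbeddings
using Random

# Same system and estimator as in the pairwise example.
rng = MersenneTwister(1234)
sys = system(Logistic4Chain(xi = [0.1, 0.2, 0.3, 0.4]; rng))
x, y, z, w = columns(first(trajectory(sys, 1000)))
est = FPVP(; k = 3, w = 5)

# Embedding delays for the source (x) and the conditional variable (y).
τx = estimate_delay(x, "mi_min")
τy = estimate_delay(y, "mi_min")

# ASSUMPTION: `τC` sets the delay for the conditional variable, and the
# conditioning series is passed as the last positional argument.
measure = PA(ηT = 1:10, τS = τx, τC = τy)
ΔA_xz_y = asymmetry(measure, est, x, z, y)
```

If `ΔA_xz_y` is no longer skewed towards positive values while the unconditional `ΔA_xz` is, that is consistent with $Y$ mediating the link between $X$ and $Z$.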
In these examples, the same time series are formally tested for independence using a PATest.