Cross mappings
ConvergentCrossMapping
ConvergentCrossMapping
directly
using CausalityTools
x, y = rand(200), rand(100)
crossmap(CCM(), x, y)
0.07726812476523078
ConvergentCrossMapping
with RandomVectors
When cross-mapping with the RandomVectors
estimator, a single random subsample of time indices (i.e. not in any particular order) of length l
is drawn for each library size l
, and cross mapping is performed using the embedding vectors corresponding to those time indices.
using CausalityTools
using Random; rng = MersenneTwister(1234)
x, y = randn(rng, 200), randn(rng, 200)
# We'll draw a single sample at each `l ∈ libsizes`. Sampling with replacement is then
# necessary, because our 200-pt timeseries will result in embeddings with
# less than 200 points.
est = RandomVectors(; libsizes = 50:10:200, replace = true, rng)
crossmap(CCM(), est, x, y)
16-element Vector{Float64}:
0.21088660153904898
0.3467362625807074
0.3933779845722765
0.1757701524286313
0.3229219861118108
0.6678028489242923
0.37416726533844197
0.49674861217692967
0.6964097521592107
0.7380731852924826
0.6120662778037197
0.5835420327792307
0.7437069713103734
0.6959511740173849
0.7250642304674656
0.6679062393303731
To generate a distribution of cross-map estimates for each l ∈ libsizes
, just call crossmap repeatedly, e.g.
using CausalityTools
using Random; rng = MersenneTwister(1234)
x, y = randn(rng, 200), randn(rng, 200)
est = RandomVectors(; libsizes = 50:10:200, replace = true, rng)
ρs = [crossmap(CCM(), est, x, y) for i = 1:80]
M = hcat(ρs...)
16×80 Matrix{Float64}:
0.210887 0.283772 0.428038 … 0.198969 0.309732 0.227941
0.346736 0.203367 0.0932958 0.215941 0.316676 -0.00783894
0.393378 0.391621 -0.00551755 0.538569 0.0652539 0.330635
0.17577 0.491756 0.430232 0.470485 0.468347 0.368178
0.322922 0.464708 0.207402 0.363501 0.413348 0.387486
0.667803 0.374655 0.810103 … 0.258513 0.328743 0.410508
0.374167 0.627199 0.406356 0.451363 0.451695 0.546689
0.496749 0.526126 0.561979 0.441477 0.364676 0.586323
0.69641 0.632972 0.526645 0.667022 0.554052 0.508119
0.738073 0.555032 0.459479 0.671852 0.551254 0.619701
0.612066 0.612132 0.537864 … 0.528112 0.617958 0.693744
0.583542 0.59571 0.575687 0.685094 0.613155 0.715312
0.743707 0.754096 0.596518 0.751794 0.49696 0.535689
0.695951 0.557804 0.69663 0.648556 0.676511 0.645601
0.725064 0.641545 0.662165 0.736896 0.608261 0.618558
0.667906 0.69464 0.736842 … 0.668381 0.754373 0.680733
Now, the k
-th row of M
contains 80
estimates of the correspondence measure ρ
at library size libsizes[k]
.
ConvergentCrossMapping
with RandomSegments
When cross-mapping with the RandomSegments
estimator, a single random subsample of continguous, ordered time indices of length l
is drawn for each library size l
, and cross mapping is performed using the embedding vectors corresponding to those time indices.
using CausalityTools
using Random; rng = MersenneTwister(1234)
x, y = randn(rng, 200), randn(rng, 200)
# We'll draw a single sample at each `l ∈ libsizes`. We limit the library size to 100,
# because drawing segments of the data longer than half the available data doesn't make
# much sense.
est = RandomSegment(; libsizes = 50:10:100, rng)
crossmap(CCM(), est, x, y)
6-element Vector{Float64}:
-0.2634055804365916
-0.14349152303626025
-0.0526762102616005
0.043365740380027676
0.05488596053242309
0.07760231285007649
As above, to generate a distribution of cross-map estimates for each l ∈ libsizes
, just call crossmap repeatedly, e.g.
using CausalityTools
using Random; rng = MersenneTwister(1234)
x, y = randn(rng, 200), randn(rng, 200)
est = RandomSegment(; libsizes = 50:10:100, rng)
ρs = [crossmap(CCM(), est, x, y) for i = 1:80]
M = hcat(ρs...)
6×80 Matrix{Float64}:
-0.263406 -0.158317 -0.0465246 … 0.165561 0.102394 -0.00505766
-0.143492 0.00585041 -0.0648202 0.0911755 -0.0759609 0.132726
-0.0526762 0.155563 0.00744757 0.0786805 -0.170245 -0.306462
0.0433657 -0.0217983 0.221699 0.0664294 0.0511184 0.0196313
0.054886 0.0292453 0.0933321 0.102349 0.208206 0.0662005
0.0776023 0.0549983 0.0379794 … 0.0814226 0.0771904 -0.00794114
Now, the k
-th row of M
contains 80
estimates of the correspondence measure ρ
at library size libsizes[k]
.
Reproducing Sugihara et al. (2012)
If copying these examples and running them locally, make sure the relevant packages (given in the first block) are loaded first.
Figure 3A
Let's reproduce figure 3A too, focusing only on ConvergentCrossMapping
this time. In this figure, they compute the cross mapping for libraries of increasing size, always starting at time index 1. This approach - which we here call the ExpandingSegment
estimator - is one of many ways of estimating the correspondence between observed and predicted value.
For this example, they use a bidirectional system with asymmetrical coupling strength.
using CausalityTools
using Statistics
using LabelledArrays
using StaticArrays
using DynamicalSystemsBase
using StateSpaceSets
using CairoMakie, Printf
function eom_logistic_sugi(u, p, t)
(; rx, ry, βxy, βyx) = p
(; x, y) = u
dx = x*(rx - rx*x - βxy*y)
dy = y*(ry - ry*y - βyx*x)
return SVector{2}(dx, dy)
end
# βxy := effect on x of y
# βyx := effect on y of x
function logistic_sugi(; u0 = rand(2), rx, ry, βxy, βyx)
p = @LArray [rx, ry, βxy, βyx] (:rx, :ry, :βxy, :βyx)
DiscreteDynamicalSystem(eom_logistic_sugi, u0, p)
end
# Used in `reproduce_figure_3A_naive`, and `reproduce_figure_3A_ensemble` below.
function add_to_fig!(fig_pos, libsizes, ρs_x̂y, ρs_ŷx; title = "", quantiles = false)
ax = Axis(fig_pos; title, aspect = 1,
xlabel = "Library size", ylabel = "Correlation (ρ)")
ylims!(ax, (-1, 1))
hlines!([0], linestyle = :dash, alpha = 0.5, color = :grey)
scatterlines!(libsizes, median.(ρs_x̂y), label = "x̂|y", color = :blue)
scatterlines!(libsizes, median.(ρs_ŷx), label = "ŷ|x", color = :red)
if quantiles
band!(libsizes, quantile.(ρs_x̂y, 0.05), quantile.(ρs_x̂y, 0.95), color = (:blue, 0.5))
band!(libsizes, quantile.(ρs_ŷx, 0.05), quantile.(ρs_ŷx, 0.95), color = (:red, 0.5))
end
axislegend(ax, position = :rb)
end
function reproduce_figure_3A_naive(measure::CrossmapMeasure)
sys_bidir = logistic_sugi(; u0 = [0.2, 0.4], rx = 3.7, ry = 3.700001, βxy = 0.02, βyx = 0.32);
x, y = columns(first(trajectory(sys_bidir, 3100, Ttr = 10000)));
libsizes = [20:2:50; 55:5:200; 300:50:500; 600:100:900; 1000:500:3000]
est = ExpandingSegment(; libsizes);
ρs_x̂y = crossmap(measure, est, x, y)
ρs_ŷx = crossmap(measure, est, y, x)
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure(resolution = (800, 300))
add_to_fig!(fig[1, 1], libsizes, ρs_x̂y, ρs_ŷx; title = "`ExpandingSegment`")
fig
end
end
reproduce_figure_3A_naive(ConvergentCrossMapping(d = 3))
Hm. This looks a bit like the paper, but the curve is not smooth. We can do better!
It is not clear from the paper exactly what they plot in their Figure 3A, if they plot an average of some kind, or precisely what parameters and initial conditions they use. However, we can get a smoother plot by using a Ensemble
. Combined with a CrossmapEstimator
, it uses Monte Carlo resampling on subsets of the input data to compute an ensemble of ρ
s that we here use to compute the median and 90-th percentile range for each library size.
function reproduce_figure_3A_ensemble(measure::CrossmapMeasure)
sys_bidir = logistic_sugi(; u0 = [0.4, 0.2], rx = 3.8, ry = 3.5, βxy = 0.02, βyx = 0.1);
x, y = columns(first(trajectory(sys_bidir, 10000, Ttr = 10000)));
# Note: our time series are 1000 points long. When embedding, some points are
# lost, so we must use slightly less points for the segments than
# there are points in the original time series.
libsizes = [20:5:50; 55:5:200; 300:50:500; 600:100:900; 1000:500:3000]
# No point in doing more than one rep, because there data are always the same
# for `ExpandingSegment.`
ensemble_ev = Ensemble(measure, ExpandingSegment(; libsizes); nreps = 1)
ensemble_rs = Ensemble(measure, RandomSegment(; libsizes); nreps = 30)
ensemble_rv = Ensemble(measure, RandomVectors(; libsizes); nreps = 30)
ρs_x̂y_es = crossmap(ensemble_ev, x, y)
ρs_ŷx_es = crossmap(ensemble_ev, y, x)
ρs_x̂y_rs = crossmap(ensemble_rs, x, y)
ρs_ŷx_rs = crossmap(ensemble_rs, y, x)
ρs_x̂y_rv = crossmap(ensemble_rv, x, y)
ρs_ŷx_rv = crossmap(ensemble_rv, y, x)
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure(resolution = (800, 300))
add_to_fig!(fig[1, 1], libsizes, ρs_x̂y_es, ρs_ŷx_es; title = "`ExpandingSegment`", quantiles = false) # quantiles make no sense for `ExpandingSegment`
add_to_fig!(fig[1, 2], libsizes, ρs_x̂y_rs, ρs_ŷx_rs; title = "`RandomSegment`", quantiles = true)
add_to_fig!(fig[1, 3], libsizes, ρs_x̂y_rv, ρs_ŷx_rv; title = "`RandomVector`", quantiles = true)
fig
end
end
reproduce_figure_3A_ensemble(ConvergentCrossMapping(d = 3, τ = -1))
With the RandomVectors
estimator, the mean of our ensemble ρ
s seem to look pretty much identical to Figure 3A in Sugihara et al. The RandomSegment
estimator also performs pretty well, but since subsampled segments are contiguous, there are probably some autocorrelation effects at play.
We can avoid the autocorrelation issue by tuning the w
parameter of the ConvergentCrossMapping
measure, which is the Theiler window. Setting the Theiler window to w > 0
, we can exclude neighbors of a query point p
that are close to p
in time, and thus deal with autocorrelation issues that way (the default w = 0
excludes only the point itself). Let's re-do the analysis with w = 5
, just for fun.
reproduce_figure_3A_ensemble(ConvergentCrossMapping(d = 3, τ = -1, w = 5))
There wasn't really that much of a difference, since for the logistic map, the autocorrelation function flips sign for every lag increase. However, for examples from other systems, tuning w
may be important.
Figure 3B
What about figure 3B? Here they generate time series of length 400 for a range of values for both coupling parameters, and plot the dominant direction $\Delta = \rho(\hat{x} | y) - \rho(\hat{y} | x)$.
In the paper, they use a 1000 different parameterizations for the logistic map parameters, but don't state what is summarized in the plot. For simplicity, we'll therefore just stick to rx = ry = 3.7
, as in the examples above, and just loop over the coupling strengths in either direction.
function reproduce_figure_3B()
βxys = 0.0:0.02:0.4
βyxs = 0.0:0.02:0.4
ρx̂ys = zeros(length(βxys), length(βyxs))
ρŷxs = zeros(length(βxys), length(βyxs))
for (i, βxy) in enumerate(βxys)
for (j, βyx) in enumerate(βyxs)
sys_bidir = logistic_sugi(; u0 = [0.2, 0.4], rx = 3.7, ry = 3.7, βxy, βyx);
# Generate 1000 points. Randomly select a 400-pt long segment.
x, y = columns(first(trajectory(sys_bidir, 400, Ttr = 10000)));
ensemble = Ensemble(CCM(d = 3, w = 5, τ = -1), RandomVectors(libsizes = 100), nreps = 50)
ρx̂ys[i, j] = mean(crossmap(ensemble, x, y))
ρŷxs[i, j] = mean(crossmap(ensemble, y, x))
end
end
Δ = ρŷxs .- ρx̂ys
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure();
ax = Axis(fig[1, 1], xlabel = "βxy", ylabel = "βyx")
cont = contourf!(ax, Δ, levels = range(-1, 1, length = 10),
colormap = :curl)
ax.xticks = 1:length(βxys), string.([i % 2 == 0 ? βxys[i] : "" for i in 1:length(βxys)])
ax.yticks = 1:length(βyxs), string.([i % 2 == 0 ? βyxs[i] : "" for i in 1:length(βyxs)])
Colorbar(fig[1 ,2], cont, label = "Δ (ρ(ŷ|x) - ρ(x̂|y))")
tightlimits!(ax)
fig
end
end
reproduce_figure_3B()
Figures 3C and 3D
Let's reproduce figures 3C and 3D in Sugihara et al. (2012)[Sugihara2012], which introduced the ConvergentCrossMapping
measure. Equations and parameters can be found in their supplementary material. Simulatenously, we also compute the PairwiseAsymmetricInference
measure from McCracken & Weigel (2014)[McCracken2014], which is a related method, but uses a slightly different embedding.
using CausalityTools
using Statistics
using LabelledArrays
using StaticArrays
using DynamicalSystemsBase
using StateSpaceSets
using CairoMakie, Printf
# -----------------------------------------------------------------------------------------
# Create 500-point long time series for Sugihara et al. (2012)'s example for figure 3.
# -----------------------------------------------------------------------------------------
sys_unidir = logistic_sugi(; u0 = [0.2, 0.4], rx = 3.7, ry = 3.700001, βxy = 0.00, βyx = 0.32);
x, y = columns(first(trajectory(sys_unidir, 500, Ttr = 10000)));
# -----------------------------------------------------------------------------------------
# Cross map.
# -----------------------------------------------------------------------------------------
m_ccm = ConvergentCrossMapping(d = 2)
m_pai = PairwiseAsymmetricInference(d = 2)
# Make predictions x̂y, i.e. predictions `x̂` made from embedding of y (AND x, if PAI)
t̂ccm_x̂y, tccm_x̂y, ρccm_x̂y = predict(m_ccm, x, y)
t̂pai_x̂y, tpai_x̂y, ρpai_x̂y = predict(m_pai, x, y);
# Make predictions ŷx, i.e. predictions `ŷ` made from embedding of x (AND y, if PAI)
t̂ccm_ŷx, tccm_ŷx, ρccm_ŷx = predict(m_ccm, y, x)
t̂pai_ŷx, tpai_ŷx, ρpai_ŷx = predict(m_pai, y, x);
# -----------------------------------------------------------------------------------------
# Plot results
# -----------------------------------------------------------------------------------------
ρs = (ρccm_x̂y, ρpai_x̂y, ρccm_ŷx, ρpai_ŷx)
sccm_x̂y, spai_x̂y, sccm_ŷx, spai_ŷx = (map(ρ -> (@sprintf "%.3f" ρ), ρs)...,)
ρs = (ρccm_x̂y, ρpai_x̂y, ρccm_ŷx, ρpai_ŷx)
sccm_x̂y, spai_x̂y, sccm_ŷx, spai_ŷx = (map(ρ -> (@sprintf "%.3f" ρ), ρs)...,)
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure();
ax_ŷx = Axis(fig[2,1], aspect = 1, xlabel = "y(t) (observed)", ylabel = "ŷ(t) | x (predicted)")
ax_x̂y = Axis(fig[2,2], aspect = 1, xlabel = "x(t) (observed)", ylabel = "x̂(t) | y (predicted)")
xlims!(ax_ŷx, (0, 1)), ylims!(ax_ŷx, (0, 1))
xlims!(ax_x̂y, (0, 1)), ylims!(ax_x̂y, (0, 1))
ax_ts = Axis(fig[1, 1:2], xlabel = "Time (t)", ylabel = "Value")
scatterlines!(ax_ts, x[1:300], label = "x")
scatterlines!(ax_ts, y[1:300], label = "y")
axislegend()
scatter!(ax_ŷx, tccm_ŷx, t̂ccm_ŷx, label = "CCM (ρ = $sccm_ŷx)", color = :black)
scatter!(ax_ŷx, tpai_ŷx, t̂pai_ŷx, label = "PAI (ρ = $spai_ŷx)", color = :red)
axislegend(ax_ŷx, position = :lt)
scatter!(ax_x̂y, tccm_x̂y, t̂ccm_x̂y, label = "CCM (ρ = $sccm_x̂y)", color = :black)
scatter!(ax_x̂y, tpai_x̂y, t̂pai_x̂y, label = "PAI (ρ = $spai_x̂y)", color = :red)
axislegend(ax_x̂y, position = :lt)
fig
end
PairwiseAsymmetricInference
Reproducing McCracken & Weigel (2014)
Let's try to reproduce figure 8 from McCracken & Weigel (2014)'s[McCracken2014] paper on PairwiseAsymmetricInference
(PAI). We'll start by defining the their example B (equations 6-7). This system consists of two variables $X$ and $Y$, where $X$ drives $Y$.
After we have computed the PAI in both directions, we define a measure of directionality as the difference between PAI in the $X \to Y$ direction and in the $Y \to X$ direction, so that if $X$ drives $Y$, then $\Delta < 0$.
using CausalityTools
using LabelledArrays
using StaticArrays
using DynamicalSystemsBase
using StateSpaceSets
using CairoMakie, Printf
using Distributions: Normal
using Statistics: mean, std
function eom_nonlinear_sindriver(dx, x, p, n)
a, b, c, t, Δt = (p...,)
x, y = x[1], x[2]
𝒩 = Normal(0, 1)
dx[1] = sin(t)
dx[2] = a*x * (1 - b*x) + c* rand(𝒩)
p[end-1] += 1 # update t
return
end
function nonlinear_sindriver(;u₀ = rand(2), a = 1.0, b = 1.0, c = 2.0, Δt = 1)
DiscreteDynamicalSystem(eom_nonlinear_sindriver, u₀, [a, b, c, 0, Δt])
end
function reproduce_figure_8_mccraken(;
c = 2.0, Δt = 0.2,
as = 0.25:0.25:5.0,
bs = 0.25:0.25:5.0)
# -----------------------------------------------------------------------------------------
# Generate many time series for many different values of the parameters `a` and `b`,
# and compute PAI. This will replicate the upper right panel of
# figure 8 in McCracken & Weigel (2014).
# -----------------------------------------------------------------------------------------
measure = PairwiseAsymmetricInference(d = 3)
# Manually resample `nreps` length-`L` time series and use mean ρ(x̂|X̄y) - ρ(ŷ|Ȳx)
# for each parameter combination.
nreps = 50
L = 300 # length of timeseries
Δ = zeros(length(as), length(bs))
for (i, a) in enumerate(as)
for (j, b) in enumerate(bs)
s = nonlinear_sindriver(; a, b, c, Δt)
x, y = columns(first(trajectory(s, 1000, Ttr = 10000)))
Δreps = zeros(nreps)
for i = 1:nreps
# Ensure we're subsampling at the same time indices.
ind_start = rand(1:(1000-L))
r = ind_start:(ind_start + L)
Δreps[i] = @views crossmap(measure, y[r], x[r]) -
crossmap(measure, x[r], y[r])
end
Δ[i, j] = mean(Δreps)
end
end
# -----------------------------------------------------------------------------------------
# An example time series for plotting.
# -----------------------------------------------------------------------------------------
sys = nonlinear_sindriver(; a = 1.0, b = 1.0, c, Δt)
npts = 500
orbit = first(trajectory(sys, npts, Ttr = 10000))
x, y = columns(orbit)
with_theme(theme_minimal(),
markersize = 5) do
X = x[1:300]
Y = y[1:300]
fig = Figure();
ax_ts = Axis(fig[1, 1:2], xlabel = "Time (t)", ylabel = "Value")
scatterlines!(ax_ts, (X .- mean(X)) ./ std(X), label = "x")
scatterlines!(ax_ts, (Y .- mean(Y)) ./ std(Y), label = "y")
axislegend()
ax_hm = Axis(fig[2, 1:2], xlabel = "a", ylabel = "b")
ax_hm.yticks = (1:length(as), string.([i % 2 == 0 ? as[i] : "" for i = 1:length(as)]))
ax_hm.xticks = (1:length(bs), string.([i % 2 == 0 ? bs[i] : "" for i = 1:length(bs)]))
hm = heatmap!(ax_hm, Δ, colormap = :viridis)
Colorbar(fig[2, 3], hm; label = "Δ' = ρ(ŷ | yx) - ρ(x̂ | xy)")
fig
end
end
reproduce_figure_8_mccraken()
As expected, $\Delta < 0$ for all parameter combinations, implying that $X$ "PAI drives" $Y$.
- Sugihara2012Sugihara, G., May, R., Ye, H., Hsieh, C. H., Deyle, E., Fogarty, M., & Munch, S. (2012). Detecting causality in complex ecosystems. science, 338(6106), 496-500.
- McCracken2014McCracken, J. M., & Weigel, R. S. (2014). Convergent cross-mapping and pairwise asymmetric inference. Physical Review E, 90(6), 062903.