Cross-map API
Estimators
Associations.CrossmapEstimator
— TypeCrossmapEstimator{M<:CrossmapMeasure, LIBSIZES, RNG}
The abstract supertype for all cross-map estimators.
Concrete subtypes
Description
Because the type of the library may differ between estimators, and because RNGs from different packages may be used, subtypes must implement the LIBSIZES
and RNG
type parameters.
For efficiency purposes, subtypes may contain mutable containers that can be re-used for ensemble analysis (see Ensemble
).
A cross-map estimator uses the concept of "libraries". A library is essentially just a reference to a set of points, and usually, a library refers to indices of points, not the actual points themselves.
For example, for timeseries, RandomVectors(libsizes = 50:25:100)
produces three separate libraries, where the first contains 50 randomly selected time indices, the second contains 75 randomly selected time indices, and the third contains 100 randomly selected time indices. This of course assumes that all quantities involved can be indexed using the same time indices, meaning that the concept of "library" only makes sense after relevant quantities have been jointly embedded, so that they can be jointly indexed. For non-instantaneous prediction, the maximum possible library size shrinks with the magnitude of the index/time-offset for the prediction.
For spatial analyses (not yet implemented), indices could be more complex and involve multi-indices.
Associations.RandomVectors
— TypeRandomVectors <: CrossmapEstimator
RandomVectors(definition::CrossmapMeasure; libsizes, replace = false,
rng = Random.default_rng())
Cross map once over N = length(libsizes)
different "point libraries", where point indices are selected randomly (not considering time ordering).
This is method 3 from Luo et al. (2015). See CrossmapEstimator
for an in-depth explanation of what "library" means in this context.
Description
The cardinality of the point libraries are given by libsizes
. One set of random point indices is selected per L ∈ libsizes
, and the i
-th library has cardinality k = libsizes[i]
.
Point indices within each library are randomly selected, independently of other libraries. A user-specified rng
may be specified for reproducibility. The replace
argument controls whether sampling is done with or without replacement. If the time series you're cross mapping between have length M
, and Lᵢ < M
for any Lᵢ ∈ libsizes
, then you must set replace = true
.
Returns
The return type when used with association
depends on the type of libsizes
.
- If
libsizes
is anInt
(a single library), then a single cross-map estimate is returned. - If
libsizes
is anAbstractVector{Int}
(multiple libraries), then a vector of cross-map estimates is returned –- one per library.
See also: CrossmapEstimator
.
Associations.RandomSegment
— TypeRandomSegment <: CrossmapEstimator
RandomSegment(definition::CrossmapMeasure; libsizes::Int, rng = Random.default_rng())
Cross map once over N = length(libsizes)
different "point libraries", where point indices are selected as time-contiguous segments with random starting points.
This is method 2 from Luo et al. (2015). See CrossmapEstimator
for an in-depth explanation of what "library" means in this context.
Description
The cardinality of the point index segments are given by libsizes
. One segment with a randomly selected starting point is picked per L ∈ libsizes
, and the i
-th point index segment has cardinality k = libsizes[i]
.
The starting point for each library is selected independently of other libraries. A user-specified rng
may be specified for reproducibility. If the time series you're cross mapping between have length M
, and Lᵢ < M
for any Lᵢ ∈ libsizes
, then an error will be thrown.
A user-specified rng
may be specified for reproducibility.
Returns
The return type when used with association
depends on the type of libsizes
.
- If
libsizes
is anInt
(a single library), then a single cross-map estimate is returned. - If
libsizes
is anAbstractVector{Int}
(multiple libraries), then a vector of cross-map estimates is returned –- one per library.
See also: CrossmapEstimator
.
Associations.ExpandingSegment
— TypeExpandingSegment <: CrossmapEstimator
ExpandingSegment(definition::CrossmapMeasure; libsizes, rng = Random.default_rng())
Cross map once over N = length(libsizes)
different "point libraries", where point indices are selected as time-contiguous segments/windows.
This is the method from (Sugihara et al., 2012). See CrossmapEstimator
for an in-depth explanation of what "library" means in this context.
Description
Point index segments are selected as first available data point index, up to the L
th data point index. This results in one library of contiguous time indices per L ∈ libsizes
.
If used in an ensemble setting, the estimator is applied to time indices Lmin:step:Lmax
of the joint embedding.
Returns
The return type when used with association
depends on the type of libsizes
.
- If
libsizes
is anInt
(a single library), then a single cross-map estimate is returned. - If
libsizes
is anAbstractVector{Int}
(multiple libraries), then a vector of cross-map estimates is returned –- one per library.
Associations.Ensemble
— TypeEnsemble(est::CrossmapEstimator{<:CrossmapMeasure}, nreps::Int = 100)
A directive to compute an ensemble analysis, where measure
(e.g. ConvergentCrossMapping
) is computed using the given estimator est
(e.g. RandomVectors
)
Examples
- Example 1: Reproducing Figure 3A from Sugihara et al. (2012).
Advanced utility methods
For most use cases, it is sufficient to provide a CrossmapEstimator
to association
to compute a cross map measure. However, in some cases it can be useful to have more fine-grained controls. We offer a few utility functions for this purpose.
These functions are used in the examples below, where we reproduce Figures 3C and 3D of Sugihara et al. (2012) and reproduce figures from McCracken and Weigel (2014).
Associations.predict
— Functionpredict(measure::CrossmapEstimator, t::AbstractVector, s::AbstractVector) → t̂ₛ, t̄, ρ
predict(measure::CrossmapEstimator, t̄::AbstractVector, S̄::AbstractStateSpaceSet) → t̂ₛ
Perform point-wise cross mappings between source embeddings and target time series according to the algorithm specified by the given cross-map measure
(e.g. ConvergentCrossMapping
or PairwiseAsymmetricInference
).
- First method: Jointly embeds the target
t
and sources
time series (according tomeasure
) to obtain time-index aligned target timeseriest̄
and source embeddingS̄
(which is now aStateSpaceSet
). Then callspredict(measure, t̄, S̄)
(the first method), and returns both the predictionst̂ₛ
, observationst̄
and their correspondenceρ
according tomeasure
. - Second method: Returns a vector of predictions
t̂ₛ
(t̂ₛ
:= "predictions oft̄
based on source embeddingS̄
"), wheret̂ₛ[i]
is the prediction fort̄[i]
. It assumes pre-embedded data which have been correctly time-aligned using a joint embedding, i.e. such thatt̄[i]
andS̄[i]
correspond to the same time index.
Description
For each i ∈ {1, 2, …, N}
where N = length(t) == length(s)
, we make the prediction t̂[i]
(an estimate of t[i]
) based on a linear combination of D + 1
other points in t
, where the selection of points and weights for the linear combination are determined by the D+1
nearest neighbors of the point S̄[i]
. The details of point selection and weights depend on measure
.
Note: Some CrossmapMeasure
s may define more general mapping procedures. If so, the algorithm is described in their docstring.
Associations.crossmap
— Functioncrossmap(measure::CrossmapEstimator, t::AbstractVector, s::AbstractVector) → ρ::Real
crossmap(measure::CrossmapEstimator, est, t::AbstractVector, s::AbstractVector) → ρ::Vector
crossmap(measure::CrossmapEstimator, t̄::AbstractVector, S̄::AbstractStateSpaceSet) → ρ
Compute the cross map estimates between between raw time series t
and s
(and return the real-valued cross-map statistic ρ
). If a CrossmapEstimator
est
is provided, cross mapping is done on random subsamples of the data, where subsampling is dictated by est
(a vector of values for ρ
is returned).
Alternatively, cross-map between time-aligned time series t̄
and source embedding S̄
that have been constructed by jointly (pre-embedding) some input data.
This is just a wrapper around predict
that simply returns the correspondence measure between the source and the target.
Example: reproducing Sugihara et al. (2012)
If copying these examples and running them locally, make sure the relevant packages (given in the first block) are loaded first.
Figure 3A
Let's reproduce figure 3A too, focusing only on ConvergentCrossMapping
this time. In this figure, they compute the cross mapping for libraries of increasing size, always starting at time index 1. This approach - which we here call the ExpandingSegment
estimator - is one of many ways of estimating the correspondence between observed and predicted value.
For this example, they use a bidirectional system with asymmetrical coupling strength.
using Associations
using Statistics
using LabelledArrays
using StaticArrays
using DynamicalSystemsBase
using StateSpaceSets
using CairoMakie, Printf
function eom_logistic_sugi(u, p, t)
(; rx, ry, βxy, βyx) = p
(; x, y) = u
dx = x*(rx - rx*x - βxy*y)
dy = y*(ry - ry*y - βyx*x)
return SVector{2}(dx, dy)
end
# βxy := effect on x of y
# βyx := effect on y of x
function logistic_sugi(; u0 = rand(2), rx, ry, βxy, βyx)
p = @LArray [rx, ry, βxy, βyx] (:rx, :ry, :βxy, :βyx)
DiscreteDynamicalSystem(eom_logistic_sugi, u0, p)
end
# Used in `reproduce_figure_3A_naive`, and `reproduce_figure_3A_ensemble` below.
function add_to_fig!(fig_pos, libsizes, ρs_x̂y, ρs_ŷx; title = "", quantiles = false)
ax = Axis(fig_pos; title, aspect = 1,
xlabel = "Library size", ylabel = "Correlation (ρ)")
ylims!(ax, (-1, 1))
hlines!([0], linestyle = :dash, alpha = 0.5, color = :grey)
scatterlines!(libsizes, median.(ρs_x̂y), label = "x̂|y", color = :blue)
scatterlines!(libsizes, median.(ρs_ŷx), label = "ŷ|x", color = :red)
if quantiles
band!(libsizes, quantile.(ρs_x̂y, 0.05), quantile.(ρs_x̂y, 0.95), color = (:blue, 0.5))
band!(libsizes, quantile.(ρs_ŷx, 0.05), quantile.(ρs_ŷx, 0.95), color = (:red, 0.5))
end
axislegend(ax, position = :rb)
end
function reproduce_figure_3A_naive(definition::CrossmapMeasure)
sys_bidir = logistic_sugi(; u0 = [0.2, 0.4], rx = 3.7, ry = 3.700001, βxy = 0.02, βyx = 0.32);
x, y = columns(first(trajectory(sys_bidir, 3100, Ttr = 10000)));
libsizes = [20:2:50; 55:5:200; 300:50:500; 600:100:900; 1000:500:3000]
est = ExpandingSegment(definition; libsizes);
ρs_x̂y = crossmap(est, x, y)
ρs_ŷx = crossmap(est, y, x)
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure(resolution = (800, 300))
add_to_fig!(fig[1, 1], libsizes, ρs_x̂y, ρs_ŷx; title = "`ExpandingSegment`")
fig
end
end
reproduce_figure_3A_naive(ConvergentCrossMapping(d = 3))
Hm. This looks a bit like the paper, but the curve is not smooth. We can do better!
It is not clear from the paper exactly what they plot in their Figure 3A, if they plot an average of some kind, or precisely what parameters and initial conditions they use. However, we can get a smoother plot by using a Ensemble
. Combined with a CrossmapEstimator
, it uses Monte Carlo resampling on subsets of the input data to compute an ensemble of ρ
s that we here use to compute the median and 90-th percentile range for each library size.
function reproduce_figure_3A_ensemble(definition::CrossmapMeasure)
sys_bidir = logistic_sugi(; u0 = [0.4, 0.2], rx = 3.8, ry = 3.5, βxy = 0.02, βyx = 0.1);
x, y = columns(first(trajectory(sys_bidir, 5000, Ttr = 10000)));
# Note: our time series are 1000 points long. When embedding, some points are
# lost, so we must use slightly less points for the segments than
# there are points in the original time series.
libsizes = [20:5:50; 55:5:200; 300:50:500; 600:100:900; 1000:500:2000]
# No point in doing more than one rep, because there data are always the same
# for `ExpandingSegment.`
ensemble_ev = Ensemble(ExpandingSegment(definition; libsizes); nreps = 1)
ensemble_rs = Ensemble(RandomSegment(definition; libsizes); nreps = 30)
ensemble_rv = Ensemble(RandomVectors(definition; libsizes); nreps = 30)
ρs_x̂y_es = crossmap(ensemble_ev, x, y)
ρs_ŷx_es = crossmap(ensemble_ev, y, x)
ρs_x̂y_rs = crossmap(ensemble_rs, x, y)
ρs_ŷx_rs = crossmap(ensemble_rs, y, x)
ρs_x̂y_rv = crossmap(ensemble_rv, x, y)
ρs_ŷx_rv = crossmap(ensemble_rv, y, x)
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure(resolution = (800, 300))
add_to_fig!(fig[1, 1], libsizes, ρs_x̂y_es, ρs_ŷx_es; title = "`ExpandingSegment`", quantiles = false) # quantiles make no sense for `ExpandingSegment`
add_to_fig!(fig[1, 2], libsizes, ρs_x̂y_rs, ρs_ŷx_rs; title = "`RandomSegment`", quantiles = true)
add_to_fig!(fig[1, 3], libsizes, ρs_x̂y_rv, ρs_ŷx_rv; title = "`RandomVector`", quantiles = true)
fig
end
end
reproduce_figure_3A_ensemble(ConvergentCrossMapping(d = 3, τ = -1))
With the RandomVectors
estimator, the mean of our ensemble ρ
s seem to look pretty much identical to Figure 3A in Sugihara et al. The RandomSegment
estimator also performs pretty well, but since subsampled segments are contiguous, there are probably some autocorrelation effects at play.
We can avoid the autocorrelation issue by tuning the w
parameter of the ConvergentCrossMapping
measure, which is the Theiler window. Setting the Theiler window to w > 0
, we can exclude neighbors of a query point p
that are close to p
in time, and thus deal with autocorrelation issues that way (the default w = 0
excludes only the point itself). Let's re-do the analysis with w = 5
, just for fun.
reproduce_figure_3A_ensemble(ConvergentCrossMapping(d = 3, τ = -1, w = 5))
There wasn't really that much of a difference, since for the logistic map, the autocorrelation function flips sign for every lag increase. However, for examples from other systems, tuning w
may be important.
Figure 3B
What about figure 3B? Here they generate time series of length 400 for a range of values for both coupling parameters, and plot the dominant direction $\Delta = \rho(\hat{x} | y) - \rho(\hat{y} | x)$.
In the paper, they use a 1000 different parameterizations for the logistic map parameters, but don't state what is summarized in the plot. For simplicity, we'll therefore just stick to rx = ry = 3.7
, as in the examples above, and just loop over the coupling strengths in either direction.
function reproduce_figure_3B()
βxys = 0.0:0.02:0.4
βyxs = 0.0:0.02:0.4
ρx̂ys = zeros(length(βxys), length(βyxs))
ρŷxs = zeros(length(βxys), length(βyxs))
for (i, βxy) in enumerate(βxys)
for (j, βyx) in enumerate(βyxs)
sys_bidir = logistic_sugi(; u0 = [0.2, 0.4], rx = 3.7, ry = 3.7, βxy, βyx);
# Generate 1000 points. Randomly select a 400-pt long segment.
x, y = columns(first(trajectory(sys_bidir, 400, Ttr = 10000)));
definition = CCM(d = 3, w = 5, τ = -1)
ensemble = Ensemble(RandomVectors(definition; libsizes = 100), nreps = 50)
ρx̂ys[i, j] = mean(crossmap(ensemble, x, y))
ρŷxs[i, j] = mean(crossmap(ensemble, y, x))
end
end
Δ = ρŷxs .- ρx̂ys
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure();
ax = Axis(fig[1, 1], xlabel = "βxy", ylabel = "βyx")
cont = contourf!(ax, Δ, levels = range(-1, 1, length = 10),
colormap = :curl)
ax.xticks = 1:length(βxys), string.([i % 2 == 0 ? βxys[i] : "" for i in 1:length(βxys)])
ax.yticks = 1:length(βyxs), string.([i % 2 == 0 ? βyxs[i] : "" for i in 1:length(βyxs)])
Colorbar(fig[1 ,2], cont, label = "Δ (ρ(ŷ|x) - ρ(x̂|y))")
tightlimits!(ax)
fig
end
end
reproduce_figure_3B()
Figures 3C and 3D
Let's reproduce figures 3C and 3D in Sugihara et al. (2012), which introduced the ConvergentCrossMapping
measure. Equations and parameters can be found in their supplementary material. Simulatenously, we also compute the PairwiseAsymmetricInference
measure from McCracken and Weigel (2014), which is a related method, but uses a slightly different embedding.
using Associations
using Statistics
using LabelledArrays
using StaticArrays
using DynamicalSystemsBase
using StateSpaceSets
using CairoMakie, Printf
# -----------------------------------------------------------------------------------------
# Create 500-point long time series for Sugihara et al. (2012)'s example for figure 3.
# -----------------------------------------------------------------------------------------
sys_unidir = logistic_sugi(; u0 = [0.2, 0.4], rx = 3.7, ry = 3.700001, βxy = 0.00, βyx = 0.32);
x, y = columns(first(trajectory(sys_unidir, 500, Ttr = 10000)));
# -----------------------------------------------------------------------------------------
# Cross map.
# -----------------------------------------------------------------------------------------
m_ccm = ConvergentCrossMapping(d = 2)
m_pai = PairwiseAsymmetricInference(d = 2)
# Make predictions x̂y, i.e. predictions `x̂` made from embedding of y (AND x, if PAI)
t̂ccm_x̂y, tccm_x̂y, ρccm_x̂y = predict(m_ccm, x, y)
t̂pai_x̂y, tpai_x̂y, ρpai_x̂y = predict(m_pai, x, y);
# Make predictions ŷx, i.e. predictions `ŷ` made from embedding of x (AND y, if PAI)
t̂ccm_ŷx, tccm_ŷx, ρccm_ŷx = predict(m_ccm, y, x)
t̂pai_ŷx, tpai_ŷx, ρpai_ŷx = predict(m_pai, y, x);
# -----------------------------------------------------------------------------------------
# Plot results
# -----------------------------------------------------------------------------------------
ρs = (ρccm_x̂y, ρpai_x̂y, ρccm_ŷx, ρpai_ŷx)
sccm_x̂y, spai_x̂y, sccm_ŷx, spai_ŷx = (map(ρ -> (@sprintf "%.3f" ρ), ρs)...,)
ρs = (ρccm_x̂y, ρpai_x̂y, ρccm_ŷx, ρpai_ŷx)
sccm_x̂y, spai_x̂y, sccm_ŷx, spai_ŷx = (map(ρ -> (@sprintf "%.3f" ρ), ρs)...,)
with_theme(theme_minimal(),
markersize = 5) do
fig = Figure();
ax_ŷx = Axis(fig[2,1], aspect = 1, xlabel = "y(t) (observed)", ylabel = "ŷ(t) | x (predicted)")
ax_x̂y = Axis(fig[2,2], aspect = 1, xlabel = "x(t) (observed)", ylabel = "x̂(t) | y (predicted)")
xlims!(ax_ŷx, (0, 1)), ylims!(ax_ŷx, (0, 1))
xlims!(ax_x̂y, (0, 1)), ylims!(ax_x̂y, (0, 1))
ax_ts = Axis(fig[1, 1:2], xlabel = "Time (t)", ylabel = "Value")
scatterlines!(ax_ts, x[1:300], label = "x")
scatterlines!(ax_ts, y[1:300], label = "y")
axislegend()
scatter!(ax_ŷx, tccm_ŷx, t̂ccm_ŷx, label = "CCM (ρ = $sccm_ŷx)", color = :black)
scatter!(ax_ŷx, tpai_ŷx, t̂pai_ŷx, label = "PAI (ρ = $spai_ŷx)", color = :red)
axislegend(ax_ŷx, position = :lt)
scatter!(ax_x̂y, tccm_x̂y, t̂ccm_x̂y, label = "CCM (ρ = $sccm_x̂y)", color = :black)
scatter!(ax_x̂y, tpai_x̂y, t̂pai_x̂y, label = "PAI (ρ = $spai_x̂y)", color = :red)
axislegend(ax_x̂y, position = :lt)
fig
end
Example: reproducing McCracken & Weigel (2014)
Let's try to reproduce figure 8 from McCracken and Weigel (2014)'s paper on PairwiseAsymmetricInference
(PAI). We'll start by defining the their example B (equations 6-7). This system consists of two variables $X$ and $Y$, where $X$ drives $Y$.
After we have computed the PAI in both directions, we define a measure of directionality as the difference between PAI in the $X \to Y$ direction and in the $Y \to X$ direction, so that if $X$ drives $Y$, then $\Delta < 0$.
using Associations
using LabelledArrays
using StaticArrays
using DynamicalSystemsBase
using StateSpaceSets
using CairoMakie, Printf
using Distributions: Normal
using Statistics: mean, std
function eom_nonlinear_sindriver(dx, x, p, n)
a, b, c, t, Δt = (p...,)
x, y = x[1], x[2]
𝒩 = Normal(0, 1)
dx[1] = sin(t)
dx[2] = a*x * (1 - b*x) + c* rand(𝒩)
p[end-1] += 1 # update t
return
end
function nonlinear_sindriver(;u₀ = rand(2), a = 1.0, b = 1.0, c = 2.0, Δt = 1)
DiscreteDynamicalSystem(eom_nonlinear_sindriver, u₀, [a, b, c, 0, Δt])
end
function reproduce_figure_8_mccraken(;
c = 2.0, Δt = 0.2,
as = 0.5:0.5:5.0,
bs = 0.5:0.5:5.0)
# -----------------------------------------------------------------------------------------
# Generate many time series for many different values of the parameters `a` and `b`,
# and compute PAI. This will replicate the upper right panel of
# figure 8 in McCracken & Weigel (2014).
# -----------------------------------------------------------------------------------------
measure = PairwiseAsymmetricInference(d = 3)
# Manually resample `nreps` length-`L` time series and use mean ρ(x̂|X̄y) - ρ(ŷ|Ȳx)
# for each parameter combination.
nreps = 50
L = 200 # length of timeseries
Δ = zeros(length(as), length(bs))
for (i, a) in enumerate(as)
for (j, b) in enumerate(bs)
s = nonlinear_sindriver(; a, b, c, Δt)
x, y = columns(first(trajectory(s, 1000, Ttr = 10000)))
Δreps = zeros(nreps)
for i = 1:nreps
# Ensure we're subsampling at the same time indices.
ind_start = rand(1:(1000-L))
r = ind_start:(ind_start + L)
Δreps[i] = @views crossmap(measure, y[r], x[r]) -
crossmap(measure, x[r], y[r])
end
Δ[i, j] = mean(Δreps)
end
end
# -----------------------------------------------------------------------------------------
# An example time series for plotting.
# -----------------------------------------------------------------------------------------
sys = nonlinear_sindriver(; a = 1.0, b = 1.0, c, Δt)
npts = 500
orbit = first(trajectory(sys, npts, Ttr = 10000))
x, y = columns(orbit)
with_theme(theme_minimal(),
markersize = 5) do
X = x[1:300]
Y = y[1:300]
fig = Figure();
ax_ts = Axis(fig[1, 1:2], xlabel = "Time (t)", ylabel = "Value")
scatterlines!(ax_ts, (X .- mean(X)) ./ std(X), label = "x")
scatterlines!(ax_ts, (Y .- mean(Y)) ./ std(Y), label = "y")
axislegend()
ax_hm = Axis(fig[2, 1:2], xlabel = "a", ylabel = "b")
ax_hm.yticks = (1:length(as), string.([i % 2 == 0 ? as[i] : "" for i = 1:length(as)]))
ax_hm.xticks = (1:length(bs), string.([i % 2 == 0 ? bs[i] : "" for i = 1:length(bs)]))
hm = heatmap!(ax_hm, Δ, colormap = :viridis)
Colorbar(fig[2, 3], hm; label = "Δ' = ρ(ŷ | yx) - ρ(x̂ | xy)")
fig
end
end
reproduce_figure_8_mccraken()
We haven't used as many parameter combinations as McCracken and Weigel (2014) did, but we get a figure that looks roughly similar to theirs.
As expected, $\Delta < 0$ for all parameter combinations, implying that $X$ "PAI drives" $Y$.