Unified Optimal Embedding

Unified approaches try to create an optimal embedding by in parallel optimizing what combination of delay times and embedding dimensions suits best.

In addition, the unified approaches are the only ones that can accommodate multi-variate inputs. This means that if you have multiple measured input timeseries, you should be able to take advantage of all of them for the best possible embedding of the dynamical system's set.

An example

In following we illustrate the most recent unified optimal embedding method, called PECUZAL, on three examples. We start with a univariate case, i.e. we only feed in one time series, here the x-component of the Lorenz system.

using DynamicalSystems

lo = Systems.lorenz([1.0, 1.0, 50.0])
tr = trajectory(lo, 100; dt = 0.01, Ttr = 10)

s = vec(tr[:, 1]) # input timeseries = x component of Lorenz
theiler = estimate_delay(s, "mi_min") # estimate a Theiler window
Tmax = 100 # maximum possible delay

Y, τ_vals, ts_vals, Ls , εs = pecuzal_embedding(s; τs = 0:Tmax , w = theiler, econ = true)

println(τ_vals)
println(ts_vals)
println(Ls)
println("L_total_uni: $(sum(Ls))")
Algorithm stopped due to increasing L-values. VALID embedding achieved ✓.
[0, 18, 9]
[1, 1, 1]
[-0.8105749460692708, -0.4331455591872686]
L_total_uni: -1.2437205052565394

The output reveals that PECUZAL suggests a 3-dimensional embedding out of the un-lagged time series as the 1st component of the reconstruction, the time series lagged by 18 samples as the 2nd component and the time series lagged by 9 samples as the 3rd component. In the third embedding cycle there is no ΔL<0 and the algorithm breaks. The result after two successful embedding cycles is the 3-dimensional embedding shown above. The total obtained decrease of ΔL throughout all encountered embedding cycles has been ~-1.24.

using PyPlot

figure(figsize=(14., 8.))
subplot(1,2,1, projection="3d")
plot3D(Y[:,1], Y[:,2], Y[:,3],"gray")
title("PECUZAL reconstructed")
xlabel("x(t+$(τ_vals[1]))")
ylabel("x(t+$(τ_vals[2]))")
zlabel("x(t+$(τ_vals[3]))")
grid()

subplot(1,2,2, projection="3d")
plot3D(tr[:,1], tr[:,2], tr[:,3],"gray")
title("Original")
xlabel("x(t)")
ylabel("y(t)")
zlabel("z(t)")
grid()

tight_layout()

We can also look at the output of the low-level function leading to the results, here the continuity statistic.

using PyPlot

figure()
plot(εs[:,1], label="1st embedding cycle")
scatter([τ_vals[2]], [εs[τ_vals[2],1]])
plot(εs[:,2], label="2nd embedding cycle")
scatter([τ_vals[3]], [εs[τ_vals[3],2]])
plot(εs[:,3], label="3rd embedding cycle")
title("Continuity statistics PECUZAL Lorenz")
xlabel("delay τ")
ylabel("⟨ε⋆⟩")
legend(loc="center right"; fontsize = 20)
grid()

The picked delay values are marked with filled circles. As already mentioned, the third embedding cycle did not contribute to the embedding, i.e. there has been no delay value chosen. Similar to the approach in the preceding example, we now highlight the capability of the PECUZAL embedding method for a multivariate input. The idea is now to feed in all three time series to the algorithm, even though this is a very far-from-reality example. We already have an adequate representation of the system we want to reconstruct, namely the three time series from the numerical integration. But let us see what PECUZAL suggests for a reconstruction.

# compute Theiler window
w1 = estimate_delay(tr[:,1], "mi_min")
w2 = estimate_delay(tr[:,2], "mi_min")
w3 = estimate_delay(tr[:,3], "mi_min")
w = maximum(hcat(w1,w2,w3))
Y_m, τ_vals_m, ts_vals_m, Ls_m , εs_m = pecuzal_embedding(tr; τs = 0:Tmax , w = theiler, econ = true)

println(τ_vals_m)
println(ts_vals_m)
println(Ls_m)
println("L_total_multi: $(sum(Ls_m))")
Algorithm stopped due to increasing L-values. VALID embedding achieved ✓.
[0, 12, 0, 79, 64, 53]
[3, 1, 1, 1, 1, 1]
[-0.9741206080702898, -0.5063498662307184, -0.0469210429481719, -0.10693317777858313, -0.009703558076084917]
L_total_multi: -1.6440282531038481

PECUZAL returns a 6-dimensional embedding using the un-lagged z- and x-component as 1st and 3rd component of the reconstruction vectors, as well as the x-component lagged by 12, 79, 64, and 53 samples. The total decrease of ΔL is ~-1.64, and thus, way smaller compared to the univariate case, as we would expect it. Nevertheless, the main contribution to this increase is made by the first two embedding cycles. For surpressing embedding cycles, which yield negligible - but negative - ΔL-values one can use the keyword argument L_threshold

Y_mt, τ_vals_mt, ts_vals_mt, Ls_mt , εs_mt = pecuzal_embedding(tr; τs = 0:Tmax , L_threshold = 0.05, w = theiler, econ = true)

println(τ_vals_mt)
println(ts_vals_mt)
println(Ls_mt)
println("L_total_multi_t: $(sum(Ls_mt))")
Algorithm stopped due to increasing L-values. VALID embedding achieved ✓.
[0, 12, 0]
[3, 1, 1]
[-0.9741206080702898, -0.5063498662307184]
L_total_multi_t: -1.4804704743010082

Let's plot these three components:

ts_str = ["x", "y", "z"]

figure(figsize=(14., 8.))
subplot(1,2,1, projection="3d")
plot3D(Y_m[:,1], Y_m[:,2], Y_m[:,3],"gray")
title("PECUZAL reconstructed")
xlabel("$(ts_str[ts_vals_m[1]])(t+$(τ_vals_m[1]))")
ylabel("$(ts_str[ts_vals_m[2]])(t+$(τ_vals_m[2]))")
zlabel("$(ts_str[ts_vals_m[3]])(t+$(τ_vals_m[3]))")
grid()

subplot(1,2,2, projection="3d")
plot3D(tr[:,1], tr[:,2], tr[:,3],"gray")
title("Original")
xlabel("x(t)")
ylabel("y(t)")
zlabel("z(t)")
grid()

tight_layout()

Finally we show what PECUZAL does with a non-deterministic source:

using Random

# Dummy input
Random.seed!(1234)
d1 = randn(1000)
d2 = rand(1000)
Tmax = 100
dummy_set = Dataset(d1,d2)

w1 = estimate_delay(d1, "mi_min")
w2 = estimate_delay(d2, "mi_min")
theiler = min(w1, w2)

Y_d, τ_vals_d, ts_vals_d, Ls_d , ε★_d = pecuzal_embedding(dummy_set; τs = 0:Tmax , w = theiler, econ = true)

size(Y_d)
(1000,)

So, no (proper) embedding is done.

All unified algorithms

Several algorithms have been created to implement a unified approach to delay coordinates embedding. You can find some implementations below:

DelayEmbeddings.pecoraFunction
pecora(s, τs, js; kwargs...) → ⟨ε★⟩, ⟨Γ⟩

Compute the (average) continuity statistic ⟨ε★⟩ and undersampling statistic ⟨Γ⟩ according to Pecora et al.[Pecoral2007] (A unified approach to attractor reconstruction), for a given input s (timeseries or Dataset) and input generalized embedding defined by (τs, js), according to genembed. The continuity statistic represents functional independence between the components of the existing embedding and one additional timeseries. The returned results are matrices with size TxJ.

Keyword arguments

  • delays = 0:50: Possible time delay values delays (in sampling time units). For each of the τ's in delays the continuity-statistic ⟨ε★⟩ gets computed. If undersampling = true (see further down), also the undersampling statistic ⟨Γ⟩ gets returned for all considered delay values.
  • J = 1:dimension(s): calculate for all timeseries indices in J. If input s is a timeseries, this is always just 1.
  • samplesize::Real = 0.1: determine the fraction of all phase space points (=length(s)) to be considered (fiducial points v) to average ε★ to produce ⟨ε★⟩, ⟨Γ⟩
  • K::Int = 13: the amount of nearest neighbors in the δ-ball (read algorithm description). Must be at least 8 (in order to gurantee a valid statistic). ⟨ε★⟩ is computed taking the minimum result over all k ∈ K.
  • metric = Chebyshev(): metrix with which to find nearest neigbhors in the input embedding (ℝᵈ space, d = length(τs)).
  • w = 1: Theiler window (neighbors in time with index w close to the point, that are excluded from being true neighbors). w=0 means to exclude only the point itself, and no temporal neighbors.
  • undersampling = false : whether to calculate the undersampling statistic or not (if not, zeros are returned for ⟨Γ⟩). Calculating ⟨Γ⟩ is thousands of times slower than ⟨ε★⟩.
  • db::Int = 100: Amount of bins used into calculating the histograms of each timeseries (for the undersampling statistic).
  • α::Real = 0.05: The significance level for obtaining the continuity statistic
  • p::Real = 0.5: The p-parameter for the binomial distribution used for the computation of the continuity statistic.

Description

Notice that the full algorithm is too large to discuss here, and is written in detail (several pages!) in the source code of pecora.

DelayEmbeddings.uzal_costFunction
uzal_cost(Y::Dataset; kwargs...) → L

Compute the L-statistic L for input dataset Y according to Uzal et al.[Uzal2011], based on theoretical arguments on noise amplification, the complexity of the reconstructed attractor and a direct measure of local stretch which constitutes an irrelevance measure. It serves as a cost function of a state space trajectory/embedding and therefore allows to estimate a "goodness of a embedding" and also to choose proper embedding parameters, while minimizing L over the parameter space. For receiving the local cost function L_local (for each point in state space - not averaged), use uzal_cost_local(...).

Keyword arguments

  • samplesize = 0.5: Number of considered fiducial points v as a fraction of input state space trajectory Y's length, in order to average the conditional variances and neighborhood sizes (read algorithm description) to produce L.
  • K = 3: the amount of nearest neighbors considered, in order to compute σ_k^2 (read algorithm description). If given a vector, minimum result over all k ∈ K is returned.
  • metric = Euclidean(): metric used for finding nearest neigbhors in the input state space trajectory `Y.
  • w = 1: Theiler window (neighbors in time with index w close to the point, that are excluded from being true neighbors). w=0 means to exclude only the point itself, and no temporal neighbors.
  • Tw = 40: The time horizon (in sampling units) up to which E_k^2 gets computed and averaged over (read algorithm description).

Description

The L-statistic is based on theoretical arguments on noise amplification, the complexity of the reconstructed attractor and a direct measure of local stretch which constitutes an irrelevance measure. Technically, it is the logarithm of the product of σ-statistic and a normalization statistic α:

L = log10(σ*α)

The σ-statistic is computed as follows. σ = √σ² = √(E²/ϵ²). approximates the conditional variance at each point in state space and for a time horizon T ∈ Tw, using K nearest neighbors. For each reference point of the state space trajectory, the neighborhood consists of the reference point itself and its K+1 nearest neighbors. measures how strong a neighborhood expands during T time steps. is averaged over many time horizons T = 1:Tw. Consequently, ϵ² is the size of the neighborhood at the reference point itself and is defined as the mean pairwise distance of the neighborhood. Finally, σ² gets averaged over a range of reference points on the attractor, which is controlled by samplesize. This is just for performance reasons and the most accurate result will obviously be gained when setting samplesize=1.0

The α-statistic is a normalization factor, such that σ's from different embeddings can be compared. α² is defined as the inverse of the sum of the inverse of all ϵ²'s for all considered reference points.

DelayEmbeddings.garcia_almeida_embeddingFunction
garcia_almeida_embedding(s; kwargs...) → Y, τ_vals, ts_vals, FNNs ,NS

A unified approach to properly embed a time series (Vector type) or a set of time series (Dataset type) based on the papers of Garcia & Almeida [Garcia2005a],[Garcia2005b].

Keyword arguments

  • τs= 0:50: Possible delay values τs (in sampling time units). For each of the τs's the N-statistic gets computed.
  • w::Int = 1: Theiler window (neighbors in time with index w close to the point, that are excluded from being true neighbors). w=0 means to exclude only the point itself, and no temporal neighbors.
  • r1 = 10: The threshold, which defines the factor of tolerable stretching for the d_E1-statistic.
  • r2 = 2: The threshold for the tolerable relative increase of the distance between the nearest neighbors, when increasing the embedding dimension.
  • fnn_thres= 0.05: A threshold value defining a sufficiently small fraction of false nearest neighbors, in order to the let algorithm terminate and stop the embedding procedure (`0 ≤ fnn_thres < 1).
  • T::Int = 1: The forward time step (in sampling units) in order to compute the d_E2-statistic (see algorithm description). Note that in the paper this is not a free parameter and always set to T=1.
  • metric = Euclidean(): metric used for finding nearest neigbhors in the input phase space trajectory Y.
  • max_num_of_cycles = 50: The algorithm will stop after that many cycles no matter what.

Description

The method works iteratively and gradually builds the final embedding vectors Y. Based on the N-statistic the algorithm picks an optimal delay value τ for each embedding cycle as the first local minimum of N. In case of multivariate embedding, i.e. when embedding a set of time series (s::Dataset), the optimal delay value τ is chosen as the first minimum from all minimum's of all considered N-statistics for each embedding cycle. The range of considered delay values is determined in τs and for the nearest neighbor search we respect the Theiler window w. After each embedding cycle the FNN-statistic FNNs [Hegger1999][Kennel1992] is being checked and as soon as this statistic drops below the threshold fnn_thres, the algorithm breaks. In order to increase the practability of the method the algorithm also breaks, when the FNN-statistic FNNs increases . The final embedding vector is stored in Y (Dataset). The chosen delay values for each embedding cycle are stored in the τ_vals and the according time series number chosen for the according delay value in τ_vals is stored in ts_vals. For univariate embedding (s::Vector) ts_vals is a vector of ones of length τ_vals, because there is simply just one time series to choose from. The function also returns the N-statistic NS for each embedding cycle as an Array of Vectors.

Notice that we were not able to reproduce the figures from the papers with our implementation (which nevertheless we believe is the correct one).

DelayEmbeddings.mdop_embeddingFunction
mdop_embedding(s::Vector; kwargs...) → Y, τ_vals, ts_vals, FNNs, βS

MDOP (for "maximizing derivatives on projection") is a unified approach to properly embed a timeseries or a set of timeseries (Dataset) based on the paper of Chetan Nichkawde [Nichkawde2013].

Keyword arguments

  • τs= 0:50: Possible delay values τs. For each of the τs's the β-statistic gets computed.
  • w::Int = 1: Theiler window (neighbors in time with index w close to the point, that are excluded from being true neighbors). w=0 means to exclude only the point itself, and no temporal neighbors.
  • fnn_thres::Real= 0.05: A threshold value defining a sufficiently small fraction of false nearest neighbors, in order to the let algorithm terminate and stop the embedding procedure (`0 ≤ fnn_thres < 1).
  • r::Real = 2: The threshold for the tolerable relative increase of the distance between the nearest neighbors, when increasing the embedding dimension.
  • max_num_of_cycles = 50: The algorithm will stop after that many cycles no matter what.

Description

The method works iteratively and gradually builds the final embedding Y. Based on the beta_statistic the algorithm picks an optimal delay value τ for each embedding cycle as the global maximum of β. In case of multivariate embedding, i.e. when embedding a set of time series (s::Dataset), the optimal delay value τ is chosen as the maximum from all maxima's of all considered β-statistics for each possible timeseries. The range of considered delay values is determined in τs and for the nearest neighbor search we respect the Theiler window w.

After each embedding cycle the FNN-statistic FNNs [Hegger1999][Kennel1992] is being checked and as soon as this statistic drops below the threshold fnn_thres, the algorithm terminates. In order to increase the practability of the method the algorithm also terminates when the FNN-statistic FNNs increases.

The final embedding is returned as Y. The chosen delay values for each embedding cycle are stored in the τ_vals and the according timeseries index chosen for the the respective according delay value in τ_vals is stored in ts_vals. βS, FNNs are returned for clarity and double-checking, since they are computed anyway. In case of multivariate embedding, βS will store all β-statistics for all available time series in each embedding cycle. To double-check the actual used β-statistics in an embedding cycle 'k', simply βS[k][:,ts_vals[k+1]].

DelayEmbeddings.pecuzal_embeddingFunction
pecuzal_embedding(s; kwargs...) → 𝒟, τ_vals, ts_vals, ΔLs, ⟨ε★⟩

A unified approach to properly embed a time series or a set of time series (Dataset) based on the ideas of Pecora et al. [Pecoral2007] and Uzal et al. [Uzal2011]. For a detailled description of the algorithm see Kraemer et al. [Kraemer2020].

Keyword arguments

  • τs = 0:50: Possible delay values τs (in sampling time units). For each of the τs's the continuity statistic ⟨ε★⟩ gets computed and further processed in order to find optimal delays τᵢ for each embedding cycle i (read algorithm description).
  • w::Int = 1: Theiler window (neighbors in time with index w close to the point, that are excluded from being true neighbors). w=0 means to exclude only the point itself, and no temporal neighbors.
  • samplesize::Real = 1: determine the fraction of all phase space points (=length(s)) to be considered (fiducial points v) to average ε★, in order to produce ⟨ε★⟩.
  • K::Int = 13: the amount of nearest neighbors in the δ-ball (read algorithm description). Must be at least 8 (in order to gurantee a valid statistic). ⟨ε★⟩ is computed taking the minimum result over all k ∈ K.
  • KNN::Int = 3: the amount of nearest neighbors considered, in order to compute σk^2 (read algorithm description [`uzalcost]@ref). If given a vector, the minimum result over allknn ∈ KNN` is returned.
  • L_threshold::Real = 0: The algorithm breaks, when this threshold is exceeded by ΔL in an embedding cycle (set as a positive number, i.e. an absolute value of ΔL).
  • α::Real = 0.05: The significance level for obtaining the continuity statistic
  • p::Real = 0.5: The p-parameter for the binomial distribution used for the computation of the continuity statistic ⟨ε★⟩.
  • max_cycles = 50: The algorithm will stop after that many cycles no matter what.
  • econ::Bool = false: Economy-mode for L-statistic computation. Instead of computing L-statistics for time horizons 2:Tw, here we only compute them for 2:2:Tw, see description for further details.

Description

The method works iteratively and gradually builds the final embedding vectors Y. Based on the ⟨ε★⟩-statistic pecora the algorithm picks an optimal delay value τᵢ for each embedding cycle i. For achieving that, we take the inpute time series s, denoted as the actual phase space trajectory Y_actual and compute the continuity statistic ⟨ε★⟩.

  1. Each local maxima in ⟨ε★⟩ is used for constructing a

candidate embedding trajectory Y_trial with a delay corresponding to that specific peak in ⟨ε★⟩. 2. We then compute the L-statistic uzal_cost for Y_trial (L-trial) and Y_actual (L_actual) for increasing prediction time horizons (free parameter in the L-statistic) and save the maximum difference max(L-trial - L_actual) as ΔL (Note that this is a negative number, since the L-statistic decreases with better reconstructions).

  1. We pick the peak/τ-value, for which ΔL is minimal (=maximum decrease of

the overall L-value) and construct the actual embedding trajectory Y_actual (steps 1.-3. correspond to an embedding cycle). 4. We repeat steps 1.-3. with Y_actual as input and stop the algorithm when ΔL is > 0, i.e. when and additional embedding component would not lead to a lower overall L-value. Y_actual -> Y.

In case of multivariate embedding, i.e. when embedding a set of M time series (s::Dataset), in each embedding cycle the continuity statistic ⟨ε★⟩ gets computed for all M time series available. The optimal delay value τ in each embedding cycle is chosen as the peak/τ-value for which ΔL is minimal under all available peaks and under all M ⟨ε★⟩'s. In the first embedding cycle there will be M² different ⟨ε★⟩'s to consider, since it is not clear a priori which time series of the input should consitute the first component of the embedding vector and form Y_actual.

The range of considered delay values is determined in τs and for the nearest neighbor search we respect the Theiler window w. The final embedding vector is stored in Y (Dataset). The chosen delay values for each embedding cycle are stored in τ_vals and the according time series numbers chosen for each delay value in τ_vals are stored in ts_vals. For univariate embedding (s::Vector) ts_vals is a vector of ones of length τ_vals, because there is simply just one time series to choose from. The function also returns the ΔLs-values for each embedding cycle and the continuity statistic ⟨ε★⟩ as an Array of Vectors.

For distance computations the Euclidean norm is used.

Low-level functions of unified approach

DelayEmbeddings.n_statisticFunction
n_statistic(Y, s; kwargs...) → N, d_E1

Perform one embedding cycle according to the method proposed in [Garcia2005a] for a given phase space trajectory Y (of type Dataset) and a time series s (of typeVector). Return the proposed N-StatisticNand all nearest neighbor distancesd_E1for each point of the input phase space trajectoryY. Note thatY` is a single time series in case of the first embedding cycle.

Keyword arguments

  • τs= 0:50: Considered delay values τs (in sampling time units). For each of the τs's the N-statistic gets computed.
  • r = 10: The threshold, which defines the factor of tolerable stretching for the d_E1-statistic (see algorithm description).
  • T::Int = 1: The forward time step (in sampling units) in order to compute the d_E2-statistic (see algorithm description). Note that in the paper this is not a free parameter and always set to T=1.
  • w::Int = 0: Theiler window (neighbors in time with index w close to the point, that are excluded from being true neighbors). w=0 means to exclude only the point itself, and no temporal neighbors. Note that in the paper this is not a free parameter and always w=0.
  • metric = Euclidean(): metric used for finding nearest neigbhors in the input phase space trajectory Y.

Description

For a range of possible delay values τs one constructs a temporary embedding matrix. That is, one concatenates the input phase space trajectory Y with the τ-lagged input time series s. For each point on the temporary trajectory one computes its nearest neighbor, which is denoted as the d_E1-statistic for a specific τ. Now one considers the distance between the reference point and its nearest neighbor T sampling units ahead and calls this statistic d_E2. [Garcia2005a] strictly use T=1, so they forward each reference point and its corresponding nearest neighbor just by one (!) sampling unit. Here it is a free parameter.

The N-statistic is then the fraction of d_E2/d_E1-pairs which exceed a threshold r.

Plotted vs. the considered τs-values it is proposed to pick the τ-value for this embedding cycle as the value, where N has its first local minimum.

DelayEmbeddings.beta_statisticFunction
beta_statistic(Y::Dataset, s::Vector) [, τs, w]) → β

Compute the β-statistic β for input state space trajectory Y and a timeseries s according to Nichkawde [Nichkawde2013], based on estimating derivatives on a projected manifold. For a range of delay values τs, β gets computed and its maximum over all considered τs serves as the optimal delay considered in this embedding cycle.

Arguments τs, w as in mdop_embedding.

Description

The β-statistic is based on the geometrical idea of maximal unfolding of the reconstructed attractor and is tightly related to the False Nearest Neighbor method ([Kennel1992]). In fact the method eliminates the maximum amount of false nearest neighbors in each embedding cycle. The idea is to estimate the absolute value of the directional derivative with respect to a possible new dimension in the reconstruction process, and with respect to the nearest neighbor, for all points of the state space trajectory:

ϕ'(τ) = Δϕd(τ) / Δxd

Δxd is simply the Euclidean nearest neighbor distance for a reference point with respect to the given Theiler window w. Δϕd(τ) is the distance of the reference point to its nearest neighbor in the one dimensional time series s, for the specific τ. Δϕ_d(τ) = |s(i+τ)-s(j+τ)|, with i being the index of the considered reference point and j the index of its nearest neighbor.

Finally,

β = log β(τ) = ⟨log₁₀ ϕ'(τ)⟩ ,

with ⟨.⟩ being the mean over all reference points. When one chooses the maximum of β over all considered τ's, one obtains the optimal delay value for this embedding cycle. Note that in the first embedding cycle, the input state space trajectory Y can also be just a univariate time series.

DelayEmbeddings.mdop_maximum_delayFunction
mdop_maximum_delay(s, tw = 1:50, samplesize = 1.0)) -> τ_max, L

Compute an upper bound for the search of optimal delays, when using mdop_embedding mdop_embedding or beta_statistic beta_statistic.

Description

The input time series s gets embedded with unit lag and increasing dimension, for dimensions (or time windows) tw (RangeObject). For each of such a time window the L-statistic from Uzal et al. [Uzal2011] will be computed. samplesize determines the fraction of points to be considered in the computation of L (see uzal_cost). When this statistic reaches its global minimum the maximum delay value τ_max gets returned. When s is a multivariate Dataset, τ_max will becomputed for all timeseries of that Dataset and the maximum value will be returned. The returned L-statistic has size (length(tw), size(s,2)).