Weighted permutation (symbolic)
`Entropies.SymbolicWeightedPermutation` — Type

`SymbolicWeightedPermutation <: PermutationProbabilityEstimator`
A symbolic, weighted-permutation-based probabilities/entropy estimator.
Properties of the original signal preserved
Weighted permutations of a signal preserve not only ordinal patterns (sorting information), but also encode amplitude information. This implementation is based on Fadlallah et al. (2013)[Fadlallah2013].
Description
Consider the $n$-element univariate time series $\{x(t) = x_1, x_2, \ldots, x_n\}$. Let $\mathbf{x}_i^{m, \tau} = \{x_i, x_{i+\tau}, \ldots, x_{i+(m-1)\tau}\}$ for $i = 1, 2, \ldots, n - (m-1)\tau$ be the $i$-th state vector in a delay reconstruction with embedding dimension $m$ and reconstruction lag $\tau$. There are then $N = n - (m-1)\tau$ state vectors.
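For illustration, the state vectors can be constructed directly from a signal; here is a minimal plain-Julia sketch (the helper name `delay_vectors` is hypothetical, not part of the package API):

```julia
# Sketch: construct the N = n - (m-1)τ state vectors of a signal x.
# Hypothetical helper for illustration; not the package's internal code.
delay_vectors(x::AbstractVector, m::Int, τ::Int) =
    [[x[i + (k - 1)*τ] for k in 1:m] for i in 1:(length(x) - (m - 1)*τ)]
```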
For an $m$-dimensional vector, there are $m!$ possible ways of sorting it in ascending order of magnitude. Each such possible sorting ordering is called a motif. Let $\pi_i^{m, \tau}$ denote the motif associated with the $m$-dimensional state vector $\mathbf{x}_i^{m, \tau}$, and let $R$ be the number of distinct motifs that occur among the $N$ state vectors. Then $1 \leq R \leq N$; $R = N$ precisely when all motifs are unique, and $R = 1$ when all motifs are the same. Each unique motif $\pi_i^{m, \tau}$ can be mapped to a unique integer symbol $0 \leq s_i \leq m!-1$. Let $S(\pi) : \mathbb{R}^m \to \mathbb{N}_0$ be the function that maps the motif $\pi$ to its symbol $s$, and let $\Pi$ denote the set of symbols $\Pi = \{ s_i \}_{i\in \{ 1, \ldots, R\}}$.
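To make the mapping $S$ concrete, here is one possible encoding based on the permutation that sorts the vector, using a Lehmer-code-style scheme. This is an illustrative assumption, not necessarily the encoding the package uses internally:

```julia
# Map a state vector to an integer symbol 0 ≤ s ≤ m! - 1 by encoding
# the sorting permutation (Lehmer code in factorial base).
# Illustrative sketch; not necessarily the package's internal encoding.
function motif_symbol(v::AbstractVector)
    m = length(v)
    perm = sortperm(v)    # the motif: permutation that sorts v ascending
    s = 0
    for i in 1:m
        # number of later entries of perm that are smaller than perm[i]
        smaller = count(j -> perm[j] < perm[i], i+1:m)
        s = s * (m - i + 1) + smaller   # accumulate in mixed radix
    end
    return s              # integer in 0:(factorial(m) - 1)
end
```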
Weighted permutation entropy is computed analogously to regular permutation entropy, but adds weights that encode amplitude information too:

$$
p_w(\pi_i^{m, \tau}) = \dfrac{\sum_{k=1}^{N} \mathbf{1}_{u:S(u) = s_i}\left( \mathbf{x}_k^{m, \tau} \right) \, w_k}{\sum_{k=1}^{N} w_k},
$$

where the indicator function equals 1 when state vector $\mathbf{x}_k^{m, \tau}$ has motif symbol $s_i$ and 0 otherwise. The weighted permutation entropy is then

$$
H_W = -\sum_{i=1}^{R} p_w(\pi_i^{m, \tau}) \log p_w(\pi_i^{m, \tau}).
$$
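A minimal sketch of this weighted counting, given the integer symbol and weight of each state vector (`weighted_probabilities` is a hypothetical helper for illustration):

```julia
# Sketch: unordered weighted probability distribution p_w over motifs.
# symbols[k] is the integer symbol of state vector k, w[k] its weight.
function weighted_probabilities(symbols::AbstractVector{<:Integer},
                                w::AbstractVector{<:Real})
    total = sum(w)
    p = Dict{Int, Float64}()
    for (s, wₖ) in zip(symbols, w)
        p[s] = get(p, s, 0.0) + wₖ / total
    end
    return p
end

# Weighted permutation entropy from the distribution:
# H = -sum(pᵢ * log(pᵢ) for pᵢ in values(p))
```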
The weighted permutation entropy is equivalent to regular permutation entropy when weights are positive and identical ($w_j = \beta \; \forall \; j \leq N$, with $\beta > 0$). Weights are dictated by the variance of the state vectors.
Let the arithmetic mean of state vector $\mathbf{x}_j^{m, \tau}$ be denoted by

$$
\hat{\mathbf{x}}_j^{m, \tau} = \frac{1}{m} \sum_{k=1}^{m} x_{j + (k-1)\tau}.
$$
Weights are then computed as

$$
w_j = \frac{1}{m} \sum_{k=1}^{m} \left( x_{j + (k-1)\tau} - \hat{\mathbf{x}}_j^{m, \tau} \right)^2,
$$

that is, $w_j$ is the variance of the elements of $\mathbf{x}_j^{m, \tau}$.
Note: in equation 7, section III, of the original paper, the authors write

$$
w_j = \frac{1}{m} \sum_{k=1}^{m} \left( x_{j + (k-1)\tau} - \hat{\mathbf{x}}_j^{m, \tau} \right)^2,
$$

but define the arithmetic mean $\hat{\mathbf{x}}_j^{m, \tau}$ over the elements $x_{j + (k+1)\tau}$.
But given the formula they give for the arithmetic mean, this is not the variance of $\mathbf{x}_j^{m, \tau}$, because the indices are mixed: $x_{j+(k-1)\tau}$ in the weights formula, vs. $x_{j+(k+1)\tau}$ in the arithmetic mean formula. This seems to imply that amplitude information about previous delay vectors is mixed with mean amplitude information about current vectors. The authors also mix the terms "vector" and "neighboring vector" (but use the same notation for both), making it hard to interpret whether the sign switch is a typo or intended. Here, we use the notation above, which actually computes the variance of $\mathbf{x}_j^{m, \tau}$.
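For concreteness, here is a minimal plain-Julia sketch of the consistent-index weight computation described above (`state_vector_weights` is a hypothetical name; this is not the package's internal code):

```julia
# Sketch: w[j] = variance of the j-th state vector x_j^{m,τ},
# with the mean and the deviations both indexed by x[j + (k-1)τ].
function state_vector_weights(x::AbstractVector, m::Int, τ::Int)
    N = length(x) - (m - 1)*τ
    w = zeros(N)
    for j in 1:N
        xⱼ = [x[j + (k - 1)*τ] for k in 1:m]  # j-th state vector
        x̂ⱼ = sum(xⱼ) / m                       # arithmetic mean
        w[j] = sum(abs2, xⱼ .- x̂ⱼ) / m         # weight = variance
    end
    return w
end
```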
Estimation from univariate time series/datasets
To compute weighted permutation entropy for a univariate signal `x`, use the signature `entropy(x::AbstractVector, est::SymbolicWeightedPermutation; τ::Int = 1, m::Int = 3)`. The corresponding (unordered) probability distribution of the permutation symbols for a univariate signal `x` can be computed using `probabilities(x::AbstractVector, est::SymbolicWeightedPermutation; τ::Int = 1, m::Int = 3)`.
By default, embedding dimension $m = 3$ with embedding lag $\tau = 1$ is used. You should make a more informed choice of embedding parameters when computing the weighted permutation entropy of a real dataset. In all cases, $m$ must be at least 2; there are no permutations of a single-element state vector, so $m \geq 2$ is required.
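As a usage sketch of the signatures above (assuming the zero-argument constructor `SymbolicWeightedPermutation()`; the noise signal is for illustration only):

```julia
using Entropies

x = rand(1000)                           # example signal: white noise
est = SymbolicWeightedPermutation()

h = entropy(x, est; m = 4, τ = 1)        # weighted permutation entropy
p = probabilities(x, est; m = 4, τ = 1)  # unordered motif distribution
```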
Estimation from multivariate time series/datasets
Although this case is not dealt with in the original paper, weighted permutation entropy, just like regular permutation entropy, can numerically speaking also be computed for multivariate datasets (either embedded or consisting of multiple time series variables). Doing so assumes that the mixed indices described above are in fact a typo.
Then, simply skip the delay reconstruction step: compute symbols directly from the $L$ existing state vectors $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_L\}$, symbolizing each $\mathbf{x}_i$ precisely as above, and compute the quantity

$$
H_W = -\sum_{i=1}^{R} p_w(\pi_i) \log p_w(\pi_i),
$$

where $p_w(\pi_i)$ is the weighted probability defined above, with the sums running over the $L$ state vectors (i.e. with $N = L$).
To compute weighted permutation entropy for a multivariate/embedded dataset `x`, use the signature `entropy(x::AbstractDataset, est::SymbolicWeightedPermutation)`. To get the corresponding probability distribution for a multivariate/embedded dataset `x`, use `probabilities(x::AbstractDataset, est::SymbolicWeightedPermutation)`.
A dynamical interpretation of the permutation entropy does not necessarily hold when computing it on generic multivariate datasets. Method signatures for `Dataset`s are provided for convenience, and should only be applied if you understand the relation between your input data, the numerical value of the weighted permutation entropy, and its interpretation.
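As a sketch of the multivariate signatures (assuming `embed` from DelayEmbeddings.jl to build the `Dataset`; random noise is used purely for illustration):

```julia
using Entropies, DelayEmbeddings

x = rand(1000)               # example scalar signal
D = embed(x, 3, 1)           # Dataset of 3-dimensional state vectors

est = SymbolicWeightedPermutation()
h = entropy(D, est)          # weighted permutation entropy
p = probabilities(D, est)    # motif probability distribution
```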
See also: `SymbolicPermutation`, `SymbolicAmplitudeAwarePermutation`.
- [Fadlallah2013]: Fadlallah, Bilal, et al. "Weighted-permutation entropy: A complexity measure for time series incorporating amplitude information." Physical Review E 87.2 (2013): 022911.