Non-exported functions
Symbolization
Entropies.symbolize
— Functionsymbolize(x::AbstractVector{T}, est::SymbolicPermutation) where {T} → Vector{Int}
symbolize!(s, x::AbstractVector{T}, est::SymbolicPermutation) where {T} → Vector{Int}
If x
is a univariate time series, first x
create a delay reconstruction of x
using embedding lag est.τ
and embedding dimension est.m
, then symbolizing the resulting state vectors with encode_motif
.
Optionally, the in-place symbolize!
can be used to put symbols in a pre-allocated integer vector s
, where length(s) == length(x)-(est.m-1)*est.τ
.
symbolize(x::AbstractDataset{m, T}, est::SymbolicPermutation) where {m, T} → Vector{Int}
symbolize!(s, x::AbstractDataset{m, T}, est::SymbolicPermutation) where {m, T} → Vector{Int}
If x
is an m
-dimensional dataset, then motif lengths are determined by the dimension of the input data, and x
is symbolized by converting each m
-dimensional state vector as a unique integer in the range $1, 2, \ldots, m-1$, using encode_motif
.
Optionally, the in-place symbolize!
can be used to put symbols in a pre-allocated integer vector s
, where length(s) == length(s)
.
Examples
Symbolize a 7-dimensional dataset. Motif lengths (or order of the permutations) are inferred to be 7.
using DelayEmbeddings, Entropies
D = Dataset([rand(7) for i = 1:1000])
s = symbolize(D, SymbolicPermutation())
Symbolize a univariate time series by first embedding it in dimension 5 with embedding lag 2. Motif lengths (or order of the permutations) are therefore 5.
using DelayEmbeddings, Entropies
n = 5000
x = rand(n)
s = symbolize(x, SymbolicPermutation(m = 5, τ = 2))
The integer vector s
now has length n-(m-1)*τ = 4992
, and each s[i]
contains the integer symbol for the ordinal pattern of state vector x[i]
.
Entropies.encode_motif
— Functionencode_motif(x, m::Int = length(x)) → s::Int
Encode the length-m
motif x
(a vector of indices that would sort some vector v
in ascending order) into its unique integer symbol $s \in \{1, 2, \ldots, m - 1 \}$, using Algorithm 1 in Berger et al. (2019)[Berger2019].
Example
v = rand(5)
# The indices that would sort `v` in ascending order. This is now a permutation
# of the index permutation (1, 2, ..., 5)
x = sortperm(v)
# Encode this permutation as an integer.
encode_motif(x)
Binning-related
Entropies.encode_as_bin
— Functionencode_as_bin(point, reference_point, edgelengths) → Vector{Int}
Encode a point into its integer bin labels relative to some reference_point
(always counting from lowest to highest magnitudes), given a set of box edgelengths
(one for each axis). The first bin on the positive side of the reference point is indexed with 0, and the first bin on the negative side of the reference point is indexed with -1.
See also: joint_visits
, marginal_visits
.
Example
using Entropies
refpoint = [0, 0, 0]
steps = [0.2, 0.2, 0.3]
encode_as_bin(rand(3), refpoint, steps)
Entropies.joint_visits
— Functionjoint_visits(points, binning_scheme::RectangularBinning) → Vector{Vector{Int}}
Determine which bins are visited by points
given the rectangular binning scheme ϵ
. Bins are referenced relative to the axis minima, and are encoded as integers, such that each box in the binning is assigned a unique integer array (one element for each dimension).
For example, if a bin is visited three times, then the corresponding integer array will appear three times in the array returned.
See also: marginal_visits
, encode_as_bin
.
Example
using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);
joint_visits(pts, RectangularBinning(0.2))
Entropies.marginal_visits
— Functionmarginal_visits(points, binning_scheme::RectangularBinning, dims) → Vector{Vector{Int}}
Determine which bins are visited by points
given the rectangular binning scheme ϵ
, but only along the desired dimensions dims
. Bins are referenced relative to the axis minima, and are encoded as integers, such that each box in the binning is assigned a unique integer array (one element for each dimension in dims
).
For example, if a bin is visited three times, then the corresponding integer array will appear three times in the array returned.
See also: joint_visits
, encode_as_bin
.
Example
using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);
# Marginal visits along dimension 3 and 5
marginal_visits(pts, RectangularBinning(0.3), [3, 5])
# Marginal visits along dimension 2 through 5
marginal_visits(pts, RectangularBinning(0.3), 2:5)
marginal_visits(joint_visits, dims) → Vector{Vector{Int}}
If joint visits have been precomputed using joint_visits
, marginal visits can be returned directly without providing the binning again using the marginal_visits(joint_visits, dims)
signature.
See also: joint_visits
, encode_as_bin
.
Example
using DelayEmbeddings, Entropies
pts = Dataset([rand(5) for i = 1:100]);
# First compute joint visits, then marginal visits along dimensions 1 and 4
jv = joint_visits(pts, RectangularBinning(0.2))
marginal_visits(jv, [1, 4])
# Marginals along dimension 2
marginal_visits(jv, 2)
- Berger2019Berger, Sebastian, et al. "Teaching Ordinal Patterns to a Computer: Efficient Encoding Algorithms Based on the Lehmer Code." Entropy 21.10 (2019): 1023.
- Berger2019Berger, Sebastian, et al. "Teaching Ordinal Patterns to a Computer: Efficient Encoding Algorithms Based on the Lehmer Code." Entropy 21.10 (2019): 1023.