Generalized entropy

Entropies.genentropy — Function
genentropy(α::Real, p::AbstractArray; base = Base.MathConstants.e)

Compute the generalized (Rényi) entropy of order α, to the given base, of an array of probabilities p, assuming that p is sum-normalized.

Description

Let $p$ be an array of probabilities (summing to 1). Then the Rényi entropy is

\[H_\alpha(p) = \frac{1}{1-\alpha} \log \left(\sum_i p[i]^\alpha\right)\]

and generalizes other known entropies, such as the Shannon (information) entropy (the limit $\alpha \to 1$, see [Shannon1948]), the maximum entropy ($\alpha=0$, also known as Hartley entropy), and the correlation entropy ($\alpha = 2$, also known as collision entropy).
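As a quick illustration of these special cases, here is a minimal, self-contained sketch of the formula above. The `renyi` helper is hypothetical and for illustration only; use `Entropies.genentropy` in practice. For a uniform distribution, every order gives the same value, log₂ of the number of outcomes:

```julia
# Hypothetical helper implementing the Rényi formula directly (illustration only).
function renyi(p, α; base = MathConstants.e)
    α == 1 && return -sum(x -> x * log(base, x), p)  # Shannon limit α → 1
    return log(base, sum(p .^ α)) / (1 - α)
end

p = fill(0.25, 4)            # uniform distribution over 4 outcomes
h0 = renyi(p, 0, base = 2)   # Hartley (maximum) entropy: log2(4) = 2
h1 = renyi(p, 1, base = 2)   # Shannon entropy: also 2 bits here
h2 = renyi(p, 2, base = 2)   # collision entropy: also 2 bits here
```

The orders only differ for non-uniform distributions, where larger α weights the high-probability outcomes more heavily.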

Example

using Entropies
p = rand(5000)
p = p ./ sum(p) # normalizing to 1 ensures we have a probability distribution

# Estimate order-1 generalized entropy to base 2 of the distribution
Entropies.genentropy(1, p, base = 2)

See also: non0hist.


Permutation entropy

genentropy(x::Dataset, est::SymbolicPermutation, α::Real = 1; base = 2) → Real
genentropy(x::AbstractVector{<:Real}, est::SymbolicPermutation, α::Real = 1; m::Int = 3, τ::Int = 1, base = 2) → Real

genentropy!(s::Vector{Int}, x::Dataset, est::SymbolicPermutation, α::Real = 1; base = 2) → Real
genentropy!(s::Vector{Int}, x::AbstractVector{<:Real}, est::SymbolicPermutation, α::Real = 1; m::Int = 3, τ::Int = 1, base = 2) → Real

Compute the generalized order-α entropy over a permutation symbolization of x, using symbol size/order m.

If x is a multivariate Dataset, then symbolization is performed directly on the state vectors. If x is a univariate signal, then a delay reconstruction with embedding lag τ and embedding dimension m is used to construct state vectors, on which symbolization is then performed.

A pre-allocated symbol array s can be provided to save some memory allocations if probabilities are to be computed for multiple data sets. If provided, it is required that length(x) == length(s) if x is a Dataset, or length(s) == length(x) - (m-1)τ if x is a univariate signal.
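For example, one symbol buffer can be reused across several equal-length signals. This sketch follows the univariate `genentropy!` signature above and assumes `SymbolicPermutation` can be constructed without arguments; variable names are illustrative:

```julia
using Entropies

m, τ = 3, 1
signals = [rand(1000) for _ in 1:10]
# For univariate signals: length(s) == length(x) - (m - 1)*τ
s = zeros(Int, length(first(signals)) - (m - 1)*τ)

# The same buffer s is overwritten on every call, avoiding re-allocation.
hs = [Entropies.genentropy!(s, x, SymbolicPermutation(), 1; m = m, τ = τ, base = 2)
      for x in signals]
```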

Probability estimation

An unordered symbol frequency histogram is obtained by symbolizing the points in x, using probabilities(::Dataset, ::SymbolicPermutation). Sum-normalizing this histogram yields a probability distribution over the symbols.
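The symbolization step can be sketched in plain Julia. This mirrors the idea, not the library's implementation: each state vector is replaced by the permutation that sorts it, and the pattern frequencies are sum-normalized.

```julia
m, τ = 3, 1
x = rand(500)
# Delay reconstruction of a univariate signal into m-dimensional state vectors
vectors = [x[i:τ:i + (m - 1)*τ] for i in 1:length(x) - (m - 1)*τ]
# Each state vector maps to the permutation (symbol) that sorts it
symbols = [sortperm(v) for v in vectors]
# Unordered symbol frequency histogram ...
counts = Dict{Vector{Int}, Int}()
for s in symbols
    counts[s] = get(counts, s, 0) + 1
end
# ... sum-normalized into a probability distribution over the symbols
probs = collect(values(counts)) ./ length(symbols)
```

At most `factorial(m)` distinct symbols can occur, which is why permutation entropy is bounded by `log(factorial(m))`.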

Entropy estimation

After the symbolization histogram/distribution has been obtained, the order α generalized entropy[Rényi1960], to the given base, is computed from that sum-normalized symbol distribution, using genentropy.

Notes

Do not confuse the order α of the generalized entropy with the order m of the permutation symbolization (m controls the symbol size). Permutation entropy is usually estimated with α = 1, but the implementation here allows the generalized entropy of any order to be computed from the symbol frequency distribution.

See also: SymbolicPermutation, genentropy.


Weighted permutation entropy

genentropy(x::Dataset, est::SymbolicWeightedPermutation, α::Real = 1; base = 2) → Real
genentropy(x::AbstractVector{<:Real}, est::SymbolicWeightedPermutation, α::Real = 1; m::Int = 3, τ::Int = 1, base = 2) → Real

Compute the generalized order α entropy based on a weighted permutation symbolization of x, using symbol size/order m for the permutations.

If x is a multivariate Dataset, then symbolization is performed directly on the state vectors. If x is a univariate signal, then a delay reconstruction with embedding lag τ and embedding dimension m is used to construct state vectors, on which symbolization is then performed.
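A minimal usage sketch for a univariate signal, following the second signature above (this assumes `SymbolicWeightedPermutation` can be constructed without arguments):

```julia
using Entropies

x = rand(1000)
# Order-1 weighted permutation entropy to base 2,
# with embedding dimension m = 3 and lag τ = 1
h = Entropies.genentropy(x, SymbolicWeightedPermutation(), 1; m = 3, τ = 1, base = 2)
```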

Probability estimation

An unordered symbol frequency histogram is obtained by symbolizing the points in x by a weighted procedure, using probabilities(::Dataset, ::SymbolicWeightedPermutation). Sum-normalizing this histogram yields a probability distribution over the weighted symbols.
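One common weighting scheme, assumed here for illustration (check the library source for the exact scheme used), weights each state vector by its empirical variance, so that high-amplitude patterns contribute more to the histogram than near-flat ones:

```julia
using Statistics

m, τ = 3, 1
x = rand(500)
# Delay reconstruction into m-dimensional state vectors
vectors = [x[i:τ:i + (m - 1)*τ] for i in 1:length(x) - (m - 1)*τ]
# Assumption: each vector is weighted by its (uncorrected) empirical variance
weights = [var(v, corrected = false) for v in vectors]
symbols = [sortperm(v) for v in vectors]
# Weighted symbol histogram: accumulate weights instead of unit counts
wcounts = Dict{Vector{Int}, Float64}()
for (s, w) in zip(symbols, weights)
    wcounts[s] = get(wcounts, s, 0.0) + w
end
# Sum-normalizing yields a probability distribution over the weighted symbols
probs = collect(values(wcounts)) ./ sum(weights)
```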

Entropy estimation

After the symbolization histogram/distribution has been obtained, the order α generalized entropy[Rényi1960], to the given base, is computed from that sum-normalized symbol distribution, using genentropy.

Notes

Do not confuse the order α of the generalized entropy with the order m of the permutation symbolization (m controls the symbol size). Permutation entropy is usually estimated with α = 1, but the implementation here allows the generalized entropy of any order to be computed from the symbol frequency distribution.

See also: SymbolicWeightedPermutation, genentropy.


Visitation frequency (binning-based) entropy

genentropy(x::Dataset, est::VisitationFrequency, α::Real = 1; base::Real = 2)

Compute the order-α generalized (Rényi) entropy[Rényi1960] of a multivariate dataset x using a visitation frequency approach.

Description

First, the state space defined by x is partitioned into rectangular boxes according to the binning instructions given by est.binning. Then, a histogram of visitations to each of those boxes is obtained, which is then sum-normalized to obtain a probability distribution, using probabilities. The generalized entropy to the given base is then computed over that box visitation distribution using genentropy(::Real, ::AbstractArray).

Example

using DelayEmbeddings, Entropies
D = Dataset(rand(1:3, 20000, 3))

# Estimator specification. Split each coordinate axis in five equal segments.
est = VisitationFrequency(RectangularBinning(5)) 

# Estimate order-1 (default) generalized entropy
Entropies.genentropy(D, est, base = 2)

See also: VisitationFrequency.

  • [Rényi1960]: A. Rényi, Proceedings of the Fourth Berkeley Symposium on Mathematics, Statistics and Probability, pp. 547 (1960)
  • [Shannon1948]: C. E. Shannon, Bell System Technical Journal 27, pp. 379 (1948)