Generalized entropy
Entropies.genentropy — Function

genentropy(α::Real, p::AbstractArray; base = Base.MathConstants.e)

Compute the entropy, to the given base, of an array of probabilities p, assuming that p is sum-normalized.
Description
Let $p$ be an array of probabilities (summing to 1). Then the Rényi entropy is

$$H_\alpha(p) = \frac{1}{1 - \alpha} \log\left(\sum_{i} p_i^\alpha\right)$$

and generalizes other known entropies, like e.g. the information entropy ($\alpha = 1$, see [Shannon1948]), the maximum entropy ($\alpha = 0$, also known as Hartley entropy), or the correlation entropy ($\alpha = 2$, also known as collision entropy).
Example
using Entropies
p = rand(5000)
p = p ./ sum(p) # normalizing to 1 ensures we have a probability distribution
# Estimate order-1 generalized entropy to base 2 of the distribution
Entropies.genentropy(1, p, base = 2)
See also: non0hist.
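To make the formula concrete, here is a minimal, self-contained re-implementation of the Rényi entropy (an illustrative sketch for exposition only; use Entropies.genentropy in practice — the function name renyi_entropy here is invented):

```julia
# Illustrative sketch of the Rényi entropy formula; not the package's code.
function renyi_entropy(p::AbstractVector{<:Real}, α::Real; base = Base.MathConstants.e)
    if isapprox(α, 1)
        # α → 1 limit: the Shannon entropy -∑ pᵢ log(pᵢ)
        return -sum(x -> x > 0 ? x * log(base, x) : zero(x), p)
    end
    return log(base, sum(x -> x^α, p)) / (1 - α)
end

p = fill(0.25, 4)                   # uniform distribution over 4 outcomes
h1 = renyi_entropy(p, 1; base = 2)  # Shannon entropy: close to log2(4) = 2 bits
h2 = renyi_entropy(p, 2; base = 2)  # collision entropy; also ≈ 2 bits for uniform p
```

For a uniform distribution the Rényi entropy is the same for every order α, which makes this a handy sanity check.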
Permutation entropy
genentropy(x::Dataset, est::SymbolicPermutation, α::Real = 1; base = 2) → Real
genentropy(x::AbstractVector{<:Real}, est::SymbolicPermutation, α::Real = 1; m::Int = 3, τ::Int = 1, base = 2) → Real
genentropy!(s::Vector{Int}, x::Dataset, est::SymbolicPermutation, α::Real = 1; base = 2) → Real
genentropy!(s::Vector{Int}, x::AbstractVector{<:Real}, est::SymbolicPermutation, α::Real = 1; m::Int = 3, τ::Int = 1, base = 2) → Real
Compute the generalized order-α entropy over a permutation symbolization of x, using symbol size/order m.
If x is a multivariate Dataset, then symbolization is performed directly on the state vectors. If x is a univariate signal, then a delay reconstruction with embedding lag τ and embedding dimension m is used to construct state vectors, on which symbolization is then performed.
A pre-allocated symbol array s can be provided to save some memory allocations if probabilities are to be computed for multiple data sets. If provided, it is required that length(x) == length(s) if x is a Dataset, or length(s) == length(x) - (m-1)τ if x is a univariate signal.
Probability estimation
An unordered symbol frequency histogram is obtained by symbolizing the points in x, using probabilities(::Dataset, ::SymbolicPermutation). Sum-normalizing this histogram yields a probability distribution over the symbols.
Entropy estimation
After the symbolization histogram/distribution has been obtained, the order-α generalized entropy [Rényi1960], to the given base, is computed from that sum-normalized symbol distribution, using genentropy.
Notes
Do not confuse the order α of the generalized entropy with the order m of the permutation entropy, which controls the symbol size. Permutation entropy is usually estimated with α = 1, but the implementation here allows the generalized entropy of any order to be computed from the symbol frequency distribution.
See also: SymbolicPermutation, genentropy.
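As a rough sketch of the univariate pipeline described above (delay embedding, ordinal-pattern symbolization, frequency counting, entropy), here is a self-contained toy version; the helper names are invented for illustration and this is not how SymbolicPermutation is implemented internally:

```julia
# Toy sketch of permutation entropy for a univariate signal
# (invented helper names; not the internals of SymbolicPermutation).
function ordinal_patterns(x::AbstractVector{<:Real}, m::Int, τ::Int)
    n = length(x) - (m - 1) * τ
    # each state vector (x[i], x[i+τ], …, x[i+(m-1)τ]) is mapped to the
    # permutation that sorts it — its "symbol"
    return [sortperm(x[i:τ:i+(m-1)*τ]) for i in 1:n]
end

function permutation_entropy(x; m = 3, τ = 1, base = 2)
    symbols = ordinal_patterns(x, m, τ)
    counts = Dict{Vector{Int},Int}()
    for s in symbols
        counts[s] = get(counts, s, 0) + 1
    end
    p = collect(values(counts)) ./ length(symbols)  # sum-normalized histogram
    return -sum(q -> q * log(base, q), p)           # order-1 (Shannon) entropy
end

x = repeat([1.0, 2.0, 3.0], 100)  # strictly periodic signal
h = permutation_entropy(x; m = 3, τ = 1)
```

For this periodic signal only three of the six possible length-3 patterns ever occur, roughly equally often, so h comes out close to log2(3) ≈ 1.585 bits rather than the log2(6) ≈ 2.585 bits of a fully random signal.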
Weighted permutation entropy
genentropy(x::Dataset, est::SymbolicWeightedPermutation, α::Real = 1; base = 2) → Real
genentropy(x::AbstractVector{<:Real}, est::SymbolicWeightedPermutation, α::Real = 1; m::Int = 3, τ::Int = 1, base = 2) → Real
Compute the generalized order-α entropy based on a weighted permutation symbolization of x, using symbol size/order m for the permutations.
If x is a multivariate Dataset, then symbolization is performed directly on the state vectors. If x is a univariate signal, then a delay reconstruction with embedding lag τ and embedding dimension m is used to construct state vectors, on which symbolization is then performed.
Probability estimation
An unordered symbol frequency histogram is obtained by symbolizing the points in x by a weighted procedure, using probabilities(::Dataset, ::SymbolicWeightedPermutation). Sum-normalizing this histogram yields a probability distribution over the weighted symbols.
Entropy estimation
After the symbolization histogram/distribution has been obtained, the order-α generalized entropy [Rényi1960], to the given base, is computed from that sum-normalized symbol distribution, using genentropy.
Notes
Do not confuse the order α of the generalized entropy with the order m of the permutation entropy, which controls the symbol size. Permutation entropy is usually estimated with α = 1, but the implementation here allows the generalized entropy of any order to be computed from the symbol frequency distribution.
See also: SymbolicWeightedPermutation, genentropy.
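In a weighted symbolization each state vector contributes to its symbol's count in proportion to a weight — a common choice (an assumption here, not taken from this page) is the variance of the state vector, so that patterns riding on large-amplitude fluctuations count more. A toy sketch under that assumption, with invented names, not the package internals:

```julia
# Toy sketch of weighted permutation entropy. ASSUMPTION: each state
# vector is weighted by its variance; invented names, not the package code.
function weighted_permutation_entropy(x::AbstractVector{<:Real}; m = 3, τ = 1, base = 2)
    n = length(x) - (m - 1) * τ
    weights = Dict{Vector{Int},Float64}()
    total = 0.0
    for i in 1:n
        v = x[i:τ:i+(m-1)*τ]
        w = sum(abs2, v .- sum(v) / m) / m  # variance of the state vector
        s = sortperm(v)                     # ordinal pattern ("symbol")
        weights[s] = get(weights, s, 0.0) + w
        total += w
    end
    p = collect(values(weights)) ./ total   # weight-normalized distribution
    return -sum(q -> q > 0 ? q * log(base, q) : 0.0, p)
end

h = weighted_permutation_entropy(randn(1000); m = 3, τ = 1)
```

For white noise all six length-3 patterns occur with similar total weight, so h approaches the maximum log2(6) ≈ 2.585 bits.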
Visitation frequency, binning based entropy
genentropy(x::Dataset, est::VisitationFrequency, α::Real = 1; base::Real = 2)
Compute the order-α generalized (Rényi) entropy [Rényi1960] of a multivariate dataset x using a visitation frequency approach.
Description
First, the state space defined by x is partitioned into rectangular boxes according to the binning instructions given by est.binning. Then a histogram of visitations to each of those boxes is obtained and sum-normalized to obtain a probability distribution, using probabilities. The generalized entropy to the given base is then computed over that box visitation distribution using genentropy(::Real, ::AbstractArray).
Example
using DelayEmbeddings, Entropies
D = Dataset(rand(1:3, 20000, 3))
# Estimator specification. Split each coordinate axis in five equal segments.
est = VisitationFrequency(RectangularBinning(5))
# Estimate order-1 (default) generalized entropy
Entropies.genentropy(D, est, base = 2)
See also: VisitationFrequency.
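The bin-count-normalize pipeline can be sketched in a few lines of plain Julia. This is a simplified illustration of the idea only — RectangularBinning supports more binning specifications than the fixed per-axis bin count assumed here, and the function name is invented:

```julia
# Simplified sketch of the visitation-frequency idea: split each coordinate
# axis into `nbins` equal segments, count visits per box, sum-normalize, and
# compute a (Shannon) entropy over the boxes. Not the package internals.
function visitation_entropy(pts::AbstractMatrix{<:Real}; nbins = 5, base = 2)
    lo = minimum(pts, dims = 1)
    hi = maximum(pts, dims = 1)
    counts = Dict{Vector{Int},Int}()
    for r in eachrow(pts)
        # box index along each axis, clamped so the maximum lands in the last bin
        box = [min(nbins, 1 + floor(Int, (r[j] - lo[j]) / ((hi[j] - lo[j]) / nbins)))
               for j in eachindex(r)]
        counts[box] = get(counts, box, 0) + 1
    end
    p = collect(values(counts)) ./ size(pts, 1)
    return -sum(q -> q * log(base, q), p)
end

h = visitation_entropy(rand(20000, 3); nbins = 5)
```

With uniformly distributed points, all 5³ = 125 boxes are visited about equally often, so h sits just below log2(125) ≈ 6.97 bits.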
- [Rényi1960] A. Rényi, Proceedings of the Fourth Berkeley Symposium on Mathematics, Statistics and Probability, pp. 547 (1960)
- [Shannon1948] C. E. Shannon, Bell Systems Technical Journal 27, pp. 379 (1948)