Entropies

Entropies API

The entropies API is defined by the entropy, entropy_maximum, and entropy_normalized functions, together with the EntropyDefinition, DiscreteEntropyEstimator, and DifferentialEntropyEstimator types documented below.

Please be sure you have read the Terminology section before going through the API here, to have a good idea of the different "flavors" of entropies and how they all come together over the common interface of the entropy function.

Entropy definitions

ComplexityMeasures.EntropyDefinitionType
EntropyDefinition

EntropyDefinition is the supertype of all types that encapsulate definitions of (generalized) entropies. These also serve as estimators of discrete entropies; see the description below.

Currently implemented entropy definitions are:

  • Shannon
  • Renyi
  • Tsallis
  • Kaniadakis
  • Curado
  • StretchedExponential

These types can be given as inputs to entropy or entropy_normalized.

Description

Mathematically speaking, generalized entropies are just nonnegative functions of probability distributions that verify certain (entropy-type-dependent) axioms. Amigó et al.'s[Amigó2018] summary paper gives a nice overview.

However, for a software implementation that computes entropies in practice, definitions are not really what matter; estimators are. In practice, one must estimate a definition from finite data, and different ways of estimating a quantity come with their own pros and cons.

That is why the type DiscreteEntropyEstimator exists, which is what is actually given to entropy. Some ways to estimate a discrete entropy only apply to a specific entropy definition. For estimators that can be applied to various entropy definitions, this is specified by providing an instance of EntropyDefinition to the estimator.
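
As a rough sketch of how these pieces fit together (assuming a discrete input series x): passing an EntropyDefinition directly should be equivalent to wrapping it in the maximum-likelihood estimator MLEntropy documented further below.

x = rand(1:4, 10_000)                    # some discrete input data
ps = probabilities(x)                    # estimate a probability mass function
h1 = entropy(Shannon(), ps)              # definition given directly
h2 = entropy(MLEntropy(Shannon()), ps)   # explicit DiscreteEntropyEstimator
h1 == h2                                 # true: both use maximum likelihood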

source
ComplexityMeasures.ShannonType
Shannon <: EntropyDefinition
Shannon(; base = 2)

The Shannon[Shannon1948] entropy, used with entropy to compute:

\[H(p) = - \sum_i p[i] \log(p[i])\]

with the $\log$ at the given base.

The maximum value of the Shannon entropy is $\log_{base}(L)$, which is the entropy of the uniform distribution with $L$ the total_outcomes.
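
For example, a uniform distribution over $L = 4$ outcomes has Shannon entropy $\log_2(4) = 2$ bits. A small sketch (assuming Probabilities can be constructed directly from a normalized probability vector):

ps = Probabilities([0.25, 0.25, 0.25, 0.25])
entropy(Shannon(; base = 2), ps)         # ≈ 2, the maximum for L = 4
entropy_maximum(Shannon(; base = 2), 4)  # 2, computed from L directly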

source
ComplexityMeasures.RenyiType
Renyi <: EntropyDefinition
Renyi(q, base = 2)
Renyi(; q = 1.0, base = 2)

The Rényi[Rényi1960] generalized order-q entropy, used with entropy to compute an entropy with units given by base (typically 2 or MathConstants.e).

Description

Let $p$ be an array of probabilities (summing to 1). Then the Rényi generalized entropy is

\[H_q(p) = \frac{1}{1-q} \log \left(\sum_i p[i]^q\right)\]

and generalizes other known entropies, such as the Shannon (information) entropy ($q = 1$, see [Shannon1948]), the maximum entropy ($q = 0$, also known as Hartley entropy), and the correlation entropy ($q = 2$, also known as collision entropy).

The maximum value of the Rényi entropy is $\log_{base}(L)$, which is the entropy of the uniform distribution with $L$ the total_outcomes.
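
As a sketch of the q-dependence (again assuming Probabilities can be constructed from a probability vector): for a non-uniform distribution the Rényi entropy is non-increasing in q, while a uniform distribution yields $\log_{base}(L)$ for every q.

ps = Probabilities([0.5, 0.25, 0.125, 0.125])
entropy(Renyi(q = 0.0), ps)   # ≈ log2(4) = 2 bits (Hartley entropy)
entropy(Renyi(q = 1.0), ps)   # Shannon entropy, 1.75 bits here
entropy(Renyi(q = 2.0), ps)   # collision entropy, smaller still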

source
ComplexityMeasures.TsallisType
Tsallis <: EntropyDefinition
Tsallis(q; k = 1.0, base = 2)
Tsallis(; q = 1.0, k = 1.0, base = 2)

The Tsallis[Tsallis1988] generalized order-q entropy, used with entropy to compute an entropy.

base only applies in the limiting case q == 1, in which the Tsallis entropy reduces to Shannon entropy.

Description

The Tsallis entropy is a generalization of the Boltzmann-Gibbs entropy, with k standing for the Boltzmann constant. It is defined as

\[S_q(p) = \frac{k}{q - 1}\left(1 - \sum_{i} p[i]^q\right)\]

The maximum value of the Tsallis entropy is $k(L^{1 - q} - 1)/(1 - q)$, with $L$ the total_outcomes.
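
A quick sketch of the limiting case mentioned above, where Tsallis with k = 1 and q = 1 should coincide with Shannon in the same base:

ps = Probabilities([0.5, 0.25, 0.125, 0.125])
entropy(Tsallis(q = 1.0, base = 2), ps) ≈ entropy(Shannon(; base = 2), ps)  # true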

source
ComplexityMeasures.KaniadakisType
Kaniadakis <: EntropyDefinition
Kaniadakis(; κ = 1.0, base = 2.0)

The Kaniadakis entropy (Tsallis, 2009)[Tsallis2009], used with entropy to compute

\[H_K(p) = -\sum_{i=1}^N p_i f_\kappa(p_i),\]

\[f_\kappa (x) = \dfrac{x^\kappa - x^{-\kappa}}{2\kappa},\]

where, if $\kappa = 0$, the regular logarithm to the given base is used, and zero probabilities are skipped.
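
A hedged sketch of the $\kappa = 0$ special case: since the regular base-2 logarithm is then used, the result should coincide with the Shannon entropy in base 2.

ps = Probabilities([0.5, 0.25, 0.125, 0.125])
entropy(Kaniadakis(κ = 0.0, base = 2), ps) ≈ entropy(Shannon(; base = 2), ps)  # true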

source
ComplexityMeasures.CuradoType
Curado <: EntropyDefinition
Curado(; b = 1.0)

The Curado entropy (Curado & Nobre, 2004)[Curado2004], used with entropy to compute

\[H_C(p) = \left( \sum_{i=1}^N e^{-b p_i} \right) + e^{-b} - 1,\]

with $b \in \mathbb{R}$, $b > 0$, where the terms outside the sum ensure that $H_C(0) = H_C(1) = 0$.

The maximum entropy for Curado is $L(1 - \exp(-b/L)) + \exp(-b) - 1$ with $L$ the total_outcomes.
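
A sketch comparing the stated maximum with entropy_maximum (using the form that takes the number of outcomes L directly); values below are illustrative:

b, L = 2.0, 4
entropy_maximum(Curado(b = b), L)    # expected to match the expression below
L*(1 - exp(-b/L)) + exp(-b) - 1      # ≈ 0.71 for b = 2, L = 4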

source
ComplexityMeasures.StretchedExponentialType
StretchedExponential <: EntropyDefinition
StretchedExponential(; η = 2.0, base = 2)

The stretched exponential, or Anteneodo-Plastino, entropy (Anteneodo & Plastino, 1999[Anteneodo1999]), used with entropy to compute

\[S_{\eta}(p) = \sum_{i = 1}^N \Gamma \left( \dfrac{\eta + 1}{\eta}, - \log_{base}(p_i) \right) - p_i \Gamma \left( \dfrac{\eta + 1}{\eta} \right),\]

where $\eta \geq 0$, $\Gamma(\cdot, \cdot)$ is the upper incomplete Gamma function, and $\Gamma(\cdot) = \Gamma(\cdot, 0)$ is the Gamma function. Reduces to Shannon entropy for η = 1.0.

The maximum entropy for StretchedExponential is a rather complicated expression involving incomplete Gamma functions (see source code).

source

Discrete entropy

ComplexityMeasures.entropyMethod
entropy([e::DiscreteEntropyEstimator,] probs::Probabilities)
entropy([e::DiscreteEntropyEstimator,] est::ProbabilitiesEstimator, x)

Compute the discrete entropy h::Real ∈ [0, ∞), using the estimator e, in one of two ways:

  1. Directly from existing Probabilities probs.
  2. From input data x, by first estimating a probability mass function using the provided ProbabilitiesEstimator, and then computing the entropy from that mass function using the provided DiscreteEntropyEstimator.

Instead of providing a DiscreteEntropyEstimator, an EntropyDefinition can be given directly, in which case MLEntropy is used as the estimator. If e is not provided, Shannon() is used by default.

Maximum entropy and normalized entropy

All discrete entropies have a well-defined maximum value for a given probability estimator. To obtain this value, one only needs to call entropy_maximum. Alternatively, one can use entropy_normalized to obtain the normalized form of the entropy (divided by the maximum).

Examples

x = [rand(Bool) for _ in 1:10000] # coin toss
ps = probabilities(x) # gives about [0.5, 0.5] by definition
h = entropy(ps) # gives 1, about 1 bit by definition
h = entropy(Shannon(), ps) # syntactically equivalent to above
h = entropy(Shannon(), CountOccurrences(x), x) # syntactically equivalent to above
h = entropy(SymbolicPermutation(;m=3), x) # gives about 2, again by definition
h = entropy(Renyi(2.0), ps) # also gives 1, order `q` doesn't matter for coin toss
source
ComplexityMeasures.entropy_maximumFunction
entropy_maximum(e::EntropyDefinition, est::ProbabilitiesEstimator, x)

Return the maximum value of a discrete entropy with the given probabilities estimator and input data x. Like in outcome_space, for some estimators the concrete outcome space is known without knowledge of input x, in which case the function dispatches to entropy_maximum(e, est).

entropy_maximum(e::EntropyDefinition, L::Int)

Same as above, but computed directly from the number of total outcomes L.
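
For example (assuming a permutation-based probabilities estimator with m = 3, which should have 3! = 6 possible outcomes), the maximum Shannon entropy is log2(6) ≈ 2.585 bits:

x = randn(10_000)
entropy_maximum(Shannon(; base = 2), SymbolicPermutation(m = 3), x)  # ≈ 2.585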

source
ComplexityMeasures.entropy_normalizedFunction
entropy_normalized([e::DiscreteEntropyEstimator,] est::ProbabilitiesEstimator, x) → h̃

Return h̃ ∈ [0, 1], the normalized discrete entropy of x, i.e. the value of entropy divided by the maximum value for e, according to the given probabilities estimator.

Instead of a discrete entropy estimator, an EntropyDefinition can be given as first argument. If e is not given, it defaults to Shannon().

Notice that there is no method entropy_normalized(e::DiscreteEntropyEstimator, probs::Probabilities), because there is no way to know the amount of possible events (i.e., the total_outcomes) from probs.
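
A small sketch: for long random input, the permutation-based Shannon entropy should be close to its maximum, so the normalized value should be close to 1.

x = randn(100_000)
entropy_normalized(Shannon(), SymbolicPermutation(m = 3), x)  # expected ≈ 1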

source

Discrete entropy estimators

ComplexityMeasures.MLEntropyType
MLEntropy(e::EntropyDefinition) <: DiscreteEntropyEstimator

Standing for "maximum likelihood entropy", and also called empirical/naive/plug-in, this estimator calculates the entropy exactly as defined in the given EntropyDefinition directly from a probability mass function.

source

Differential entropy

ComplexityMeasures.entropyMethod
entropy(est::DifferentialEntropyEstimator, x) → h

Approximate the differential entropy h::Real using the provided DifferentialEntropyEstimator and input data x. This method doesn't involve explicitly computing (discretized) probabilities first.

The overwhelming majority of entropy estimators estimate the Shannon entropy. If some estimator can estimate different definitions of entropy (e.g., Tsallis), this is provided as an argument to the estimator itself.

See the table of differential entropy estimators below for an overview of all differential entropy estimators.

Examples

A standard normal distribution has a base-e differential entropy of 0.5*log(2π) + 0.5 nats.

est = Kraskov(k = 5, base = ℯ) # Base `ℯ` for nats.
h = entropy(est, randn(1_000_000))
abs(h - 0.5*log(2π) - 0.5) # ≈ 0.001
source

Table of differential entropy estimators

The following estimators are differential entropy estimators, and can also be used with entropy.

Each DifferentialEntropyEstimator uses a specialized technique to approximate relevant densities/integrals, and is often tailored to one or a few types of generalized entropy. For example, Kraskov estimates the Shannon entropy.

| Estimator | Principle | Input data | Shannon | Renyi | Tsallis | Kaniadakis | Curado | StretchedExponential |
|---|---|---|---|---|---|---|---|---|
| KozachenkoLeonenko | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| Kraskov | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| Zhu | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| ZhuSingh | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| Gao | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| Goria | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| Lord | Nearest neighbors | StateSpaceSet | ✓ | x | x | x | x | x |
| Vasicek | Order statistics | Vector | ✓ | x | x | x | x | x |
| Ebrahimi | Order statistics | Vector | ✓ | x | x | x | x | x |
| Correa | Order statistics | Vector | ✓ | x | x | x | x | x |
| AlizadehArghami | Order statistics | Vector | ✓ | x | x | x | x | x |
ComplexityMeasures.KraskovType
Kraskov <: DiffEntropyEst
Kraskov(; k::Int = 1, w::Int = 0, base = 2)

The Kraskov estimator computes the Shannon differential entropy of a multi-dimensional StateSpaceSet, using the k-th nearest neighbor method from Kraskov et al. (2004)[Kraskov2004], in the given base.

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

Description

Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Kraskov estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
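
A hedged example (assuming StateSpaceSet is available, e.g. re-exported from StateSpaceSets.jl): the differential entropy of a d-dimensional standard normal is $\frac{d}{2}\log(2\pi e)$ nats, here with d = 2.

x = StateSpaceSet(randn(100_000, 2))
h = entropy(Kraskov(k = 3, base = ℯ), x)
abs(h - log(2π*ℯ))  # expected to be small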

See also: entropy, KozachenkoLeonenko, DifferentialEntropyEstimator.

source
ComplexityMeasures.KozachenkoLeonenkoType
KozachenkoLeonenko <: DiffEntropyEst
KozachenkoLeonenko(; w::Int = 0, base = 2)

The KozachenkoLeonenko estimator computes the Shannon differential entropy of a multi-dimensional StateSpaceSet in the given base.

Description

Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. KozachenkoLeonenko estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))]\]

using the nearest neighbor method from Kozachenko & Leonenko (1987)[KozachenkoLeonenko1987], as described in Charzyńska and Gambin[Charzyńska2016].

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

In contrast to Kraskov, this estimator uses only the closest neighbor.

See also: entropy, Kraskov, DifferentialEntropyEstimator.

source
ComplexityMeasures.ZhuType
Zhu <: DiffEntropyEst
Zhu(; k = 1, w = 0, base = 2)

The Zhu estimator (Zhu et al., 2015)[Zhu2015] is an extension to KozachenkoLeonenko, and computes the Shannon differential entropy of a multi-dimensional StateSpaceSet in the given base.

Description

Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Zhu estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))]\]

by approximating densities within hyperrectangles surrounding each point xᵢ ∈ x using k nearest neighbor searches. w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

See also: entropy, KozachenkoLeonenko, DifferentialEntropyEstimator.

source
ComplexityMeasures.ZhuSinghType
ZhuSingh <: DiffEntropyEst
ZhuSingh(; k = 1, w = 0, base = 2)

The ZhuSingh estimator (Zhu et al., 2015)[Zhu2015] computes the Shannon differential entropy of a multi-dimensional StateSpaceSet in the given base.

Description

Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. ZhuSingh estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

Like Zhu, this estimator approximates probabilities within hyperrectangles surrounding each point xᵢ ∈ x using k nearest neighbor searches. However, it also considers the number of neighbors falling on the borders of these hyperrectangles. This estimator is an extension to the entropy estimator in Singh et al. (2003).

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

See also: entropy, DifferentialEntropyEstimator.

source
ComplexityMeasures.GaoType
Gao <: DifferentialEntropyEstimator
Gao(; k = 1, w = 0, base = 2, corrected = true)

The Gao estimator (Gao et al., 2015)[Gao2015] computes the Shannon differential entropy, using a k-th nearest-neighbor approach based on Singh et al. (2003)[Singh2003].

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

Gao et al. (2015) give two variants of this estimator. If corrected == false, the uncorrected version is used. If corrected == true, the corrected version is used, which ensures that the estimator is asymptotically unbiased.

Description

Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Gao estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

source
ComplexityMeasures.GoriaType
Goria <: DifferentialEntropyEstimator
Goria(; k = 1, w = 0, base = 2)

The Goria estimator computes the Shannon differential entropy of a multi-dimensional StateSpaceSet in the given base.

Description

Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Goria estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

Specifically, let $\bf{n}_1, \bf{n}_2, \ldots, \bf{n}_N$ be the distances of the samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ to their k-th nearest neighbors. Next, let the geometric mean of these distances be

\[\hat{\rho}_k = \left( \prod_{i=1}^N \bf{n}_i \right)^{\dfrac{1}{N}}.\]

The Goria et al. (2005)[Goria2005] estimate of the Shannon differential entropy is then

\[\hat{H} = m \log(\hat{\rho}_k) + \log(N - 1) - \psi(k) + \log c_1(m),\]

where $c_1(m) = \dfrac{2\pi^\frac{m}{2}}{m \Gamma(m/2)}$ and $\psi$ is the digamma function.

source
ComplexityMeasures.LordType
Lord <: DifferentialEntropyEstimator
Lord(; k = 10, w = 0, base = 2)

Lord[Lord2015] estimates the Shannon differential entropy using a nearest neighbor approach with a local nonuniformity correction (LNC).

w is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0, meaning that only the point itself is excluded when searching for neighbours).

Description

Assume we have samples $\bar{X} = \{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Lord estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))],\]

by using the resubstitution formula

\[\hat{H}(\bar{X}, k) = -\mathbb{E}[\log(f(X))] \approx -\dfrac{1}{N}\sum_{i = 1}^N \log(\hat{f}(\bf{x}_i)),\]

where $\hat{f}(\bf{x}_i)$ is an estimate of the density at $\bf{x}_i$ constructed in a manner such that $\hat{f}(\bf{x}_i) \propto \dfrac{k(x_i) / N}{V_i}$, where $k(x_i)$ is the number of points in the neighborhood of $\bf{x}_i$, and $V_i$ is the volume of that neighborhood.

While most nearest-neighbor based differential entropy estimators use regular volume elements (e.g. hypercubes, hyperrectangles, hyperspheres) for approximating the local densities $\hat{f}(\bf{x}_i)$, the Lord estimator uses hyperellipsoid volume elements. These hyperellipsoids are, for each query point xᵢ, estimated using singular value decomposition (SVD) on the k nearest neighbors of xᵢ. Thus, the hyperellipsoids stretch/compress in response to the local geometry around each sample point. This makes Lord a well-suited entropy estimator for a wide range of systems.

source
ComplexityMeasures.VasicekType
Vasicek <: DiffEntropyEst
Vasicek(; m::Int = 1, base = 2)

The Vasicek estimator computes the Shannon differential entropy (in the given base) of a timeseries using the method from Vasicek (1976)[Vasicek1976].

The Vasicek estimator belongs to a class of differential entropy estimators based on order statistics, of which Vasicek (1976) was the first. It only works for timeseries input.

Description

Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. Vasicek estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

However, instead of estimating the above integral directly, it makes use of the equivalent integral, where $F$ is the distribution function for $X$,

\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp\]

This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$. The Vasicek Shannon differential entropy estimate is then

\[\hat{H}_V(\bar{X}, m) = \dfrac{1}{n} \sum_{i = 1}^n \log \left[ \dfrac{n}{2m} (\bar{X}_{(i+m)} - \bar{X}_{(i-m)}) \right]\]

Usage

In practice, the choice of m influences how fast the entropy converges to the true value. For small values of m, convergence is slow, so we recommend scaling m with the time series length n and using m >= n/100 (this is just a heuristic based on the tests written for this package).
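
A sketch of this heuristic (with a standard normal input, whose differential entropy is 0.5*log(2π) + 0.5 nats):

x = randn(100_000)
m = max(1, round(Int, length(x) / 100))
h = entropy(Vasicek(m = m, base = ℯ), x)
abs(h - 0.5*log(2π) - 0.5)  # expected to be small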

See also: entropy, Correa, AlizadehArghami, Ebrahimi, DifferentialEntropyEstimator.

source
ComplexityMeasures.AlizadehArghamiType
AlizadehArghami <: DiffEntropyEst
AlizadehArghami(; m::Int = 1, base = 2)

The AlizadehArghami estimator computes the Shannon differential entropy (in the given base) of a timeseries using the method from Alizadeh & Arghami (2010)[Alizadeh2010].

The AlizadehArghami estimator belongs to a class of differential entropy estimators based on order statistics. It only works for timeseries input.

Description

Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. AlizadehArghami estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

However, instead of estimating the above integral directly, it makes use of the equivalent integral, where $F$ is the distribution function for $X$:

\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp.\]

This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$. The AlizadehArghami Shannon differential entropy estimate is then the Vasicek estimate $\hat{H}_{V}(\bar{X}, m, n)$, plus a correction factor

\[\hat{H}_{A}(\bar{X}, m, n) = \hat{H}_{V}(\bar{X}, m, n) + \dfrac{2}{n}\left(m \log(2) \right).\]

See also: entropy, Correa, Ebrahimi, Vasicek, DifferentialEntropyEstimator.

source
ComplexityMeasures.EbrahimiType
Ebrahimi <: DiffEntropyEst
Ebrahimi(; m::Int = 1, base = 2)

The Ebrahimi estimator computes the Shannon differential entropy (in the given base) of a timeseries using the method from Ebrahimi (1994)[Ebrahimi1994].

The Ebrahimi estimator belongs to a class of differential entropy estimators based on order statistics. It only works for timeseries input.

Description

Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. Ebrahimi estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

However, instead of estimating the above integral directly, it makes use of the equivalent integral, where $F$ is the distribution function for $X$,

\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp\]

This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$. The Ebrahimi Shannon differential entropy estimate is then

\[\hat{H}_{E}(\bar{X}, m) = \dfrac{1}{n} \sum_{i = 1}^n \log \left[ \dfrac{n}{c_i m} (\bar{X}_{(i+m)} - \bar{X}_{(i-m)}) \right],\]

where

\[c_i = \begin{cases} 1 + \frac{i - 1}{m}, & 1 \leq i \leq m \\ 2, & m + 1 \leq i \leq n - m \\ 1 + \frac{n - i}{m}, & n - m + 1 \leq i \leq n \end{cases}.\]

See also: entropy, Correa, AlizadehArghami, Vasicek, DifferentialEntropyEstimator.

source
ComplexityMeasures.CorreaType
Correa <: DiffEntropyEst
Correa(; m::Int = 1, base = 2)

The Correa estimator computes the Shannon differential entropy (in the given base) of a timeseries using the method from Correa (1995)[Correa1995].

The Correa estimator belongs to a class of differential entropy estimators based on order statistics. It only works for timeseries input.

Description

Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. Correa estimates the Shannon differential entropy

\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]

However, instead of estimating the above integral directly, Correa makes use of the equivalent integral, where $F$ is the distribution function for $X$,

\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp\]

This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$, ensuring that end points are included. The Correa estimate of Shannon differential entropy is then

\[H_C(\bar{X}, m, n) = -\dfrac{1}{n} \sum_{i = 1}^n \log \left[ \dfrac{ \sum_{j=i-m}^{i+m}(\bar{X}_{(j)} - \tilde{X}_{(i)})(j - i)}{n \sum_{j=i-m}^{i+m} (\bar{X}_{(j)} - \tilde{X}_{(i)})^2} \right],\]

where

\[\tilde{X}_{(i)} = \dfrac{1}{2m + 1} \sum_{j = i - m}^{i + m} X_{(j)}.\]

See also: entropy, AlizadehArghami, Ebrahimi, Vasicek, DifferentialEntropyEstimator.

source
  • Amigó2018Amigó, J. M., Balogh, S. G., & Hernández, S. (2018). A brief review of generalized entropies. Entropy, 20(11), 813.
  • Shannon1948C. E. Shannon, Bell System Technical Journal 27, pp 379 (1948)
  • Rényi1960A. Rényi, Proceedings of the fourth Berkeley Symposium on Mathematics, Statistics and Probability, pp 547 (1960)
  • Tsallis1988Tsallis, C. (1988). Possible generalization of Boltzmann-Gibbs statistics. Journal of statistical physics, 52(1), 479-487.
  • Tsallis2009Tsallis, C. (2009). Introduction to nonextensive statistical mechanics: approaching a complex world. Springer, 1(1), 2-1.
  • Curado2004Curado, E. M., & Nobre, F. D. (2004). On the stability of analytic entropic forms. Physica A: Statistical Mechanics and its Applications, 335(1-2), 94-106.
  • Anteneodo1999Anteneodo, C., & Plastino, A. R. (1999). Maximum entropy approach to stretched exponential probability distributions. Journal of Physics A: Mathematical and General, 32(7), 1089.
  • Kraskov2004Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical review E, 69(6), 066138.
  • Charzyńska2016Charzyńska, A., & Gambin, A. (2016). Improvement of the k-NN entropy estimator with applications in systems biology. Entropy, 18(1), 13.
  • KozachenkoLeonenko1987Kozachenko, L. F., & Leonenko, N. N. (1987). Sample estimate of the entropy of a random vector. Problemy Peredachi Informatsii, 23(2), 9-16.
  • Zhu2015Zhu, J., Bellanger, J. J., Shu, H., & Le Bouquin Jeannès, R. (2015). Contribution to transfer entropy estimation via the k-nearest-neighbors approach. Entropy, 17(6), 4173-4201.
  • Singh2003Singh, H., Misra, N., Hnizdo, V., Fedorowicz, A., & Demchuk, E. (2003). Nearest neighbor estimates of entropy. American journal of mathematical and management sciences, 23(3-4), 301-321.
  • Gao2015Gao, S., Ver Steeg, G., & Galstyan, A. (2015, February). Efficient estimation of mutual information for strongly dependent variables. In Artificial intelligence and statistics (pp. 277-286). PMLR.
  • Goria2005Goria, M. N., Leonenko, N. N., Mergel, V. V., & Novi Inverardi, P. L. (2005). A new class of random vector entropy estimators and its applications in testing statistical hypotheses. Journal of Nonparametric Statistics, 17(3), 277-297.
  • Lord2015Lord, W. M., Sun, J., & Bollt, E. M. (2018). Geometric k-nearest neighbor estimation of entropy and mutual information. Chaos: An Interdisciplinary Journal of Nonlinear Science, 28(3), 033114.
  • Vasicek1976Vasicek, O. (1976). A test for normality based on sample entropy. Journal of the Royal Statistical Society: Series B (Methodological), 38(1), 54-59.
  • Alizadeh2010Alizadeh, N. H., & Arghami, N. R. (2010). A new estimator of entropy. Journal of the Iranian Statistical Society (JIRSS).
  • Ebrahimi1994Ebrahimi, N., Pflughoeft, K., & Soofi, E. S. (1994). Two measures of sample entropy. Statistics & Probability Letters, 20(3), 225-234.
  • Correa1995Correa, J. C. (1995). A new estimator of entropy. Communications in Statistics-Theory and Methods, 24(10), 2439-2449.