Entropies
Entropies API
The entropies API is defined by the entropy function, the EntropyDefinition subtypes, and the entropy estimators documented below.
Please be sure you have read the Terminology section before going through the API here, to have a good idea of the different "flavors" of entropies and how they all come together over the common interface of the entropy
function.
Entropy definitions
ComplexityMeasures.EntropyDefinition
— Type
EntropyDefinition
is the supertype of all types that encapsulate definitions of (generalized) entropies. Currently implemented entropy definitions are:
- Renyi.
- Tsallis.
- Shannon, which is a subcase of the above two in the limit q → 1.
- Kaniadakis.
- Curado.
- StretchedExponential.
These entropy types are given as inputs to entropy
and entropy_normalized
. Notice that in all documentation strings formulas are provided for the discrete version of the entropy, for simplicity.
See entropy
for usage.
Description
Mathematically speaking, generalized entropies are just nonnegative functions of probability distributions that verify certain (entropy-type-dependent) axioms. Amigó et al.'s[Amigó2018] summary paper gives a nice overview.
ComplexityMeasures.Shannon
— Type
Shannon <: EntropyDefinition
Shannon(; base = 2)
The Shannon[Shannon1948] entropy, used with entropy
to compute:
\[H(p) = - \sum_i p[i] \log(p[i])\]
with the $\log$ at the given base
.
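The discrete formula above can be sketched in a few lines of plain Julia, independent of the package; `shannon_entropy` is a hypothetical helper, not the ComplexityMeasures.jl API.

```julia
# Sketch of the discrete Shannon entropy formula; `shannon_entropy` is a
# hypothetical helper, not part of the ComplexityMeasures.jl API.
# The 0 log(0) = 0 convention is handled explicitly.
shannon_entropy(p; base = 2) = -sum(pᵢ -> pᵢ > 0 ? pᵢ * log(base, pᵢ) : 0.0, p)

shannon_entropy([0.5, 0.5])   # 1.0: a fair coin carries exactly 1 bit
shannon_entropy([1.0, 0.0])   # 0.0: a certain outcome carries no information
```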
ComplexityMeasures.Renyi
— Type
Renyi <: EntropyDefinition
Renyi(q, base = 2)
Renyi(; q = 1.0, base = 2)
The Rényi[Rényi1960] generalized order-q
entropy, used with entropy
to compute an entropy with units given by base
(typically 2
or MathConstants.e
).
Description
Let $p$ be an array of probabilities (summing to 1). Then the Rényi generalized entropy is
\[H_q(p) = \frac{1}{1-q} \log \left(\sum_i p[i]^q\right)\]
and generalizes other known entropies, like e.g. the information entropy ($q = 1$, see [Shannon1948]), the maximum entropy ($q=0$, also known as Hartley entropy), or the correlation entropy ($q = 2$, also known as collision entropy).
The maximum value of the Rényi entropy is $\log_{base}(L)$, which is the entropy of the uniform distribution with $L$ the total_outcomes
.
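As a concrete check of the definition, a uniform distribution attains this maximum at every order q. The following is a from-scratch sketch of the formula (not the package API; `renyi_entropy` is a hypothetical helper):

```julia
# From-scratch sketch of the Rényi formula; `renyi_entropy` is a hypothetical
# helper, not the ComplexityMeasures.jl API. q = 1 falls back to Shannon.
function renyi_entropy(p; q = 1.0, base = 2)
    q == 1 && return -sum(pᵢ -> pᵢ > 0 ? pᵢ * log(base, pᵢ) : 0.0, p)
    return log(base, sum(pᵢ -> pᵢ^q, p)) / (1 - q)
end

p_uniform = fill(0.25, 4)
renyi_entropy(p_uniform; q = 0.0)  # ≈ 2.0 = log2(4): Hartley / maximum entropy
renyi_entropy(p_uniform; q = 2.0)  # ≈ 2.0: all orders agree for the uniform p
```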
ComplexityMeasures.Tsallis
— Type
Tsallis <: EntropyDefinition
Tsallis(q; k = 1.0, base = 2)
Tsallis(; q = 1.0, k = 1.0, base = 2)
The Tsallis[Tsallis1988] generalized order-q
entropy, used with entropy
to compute an entropy.
base
only applies in the limiting case q == 1
, in which the Tsallis entropy reduces to Shannon entropy.
Description
The Tsallis entropy is a generalization of the Boltzmann-Gibbs entropy, with k
standing for the Boltzmann constant. It is defined as
\[S_q(p) = \frac{k}{q - 1}\left(1 - \sum_{i} p[i]^q\right)\]
If the probability estimator has known alphabet length $L$, then the maximum value of the Tsallis entropy is $k(L^{1 - q} - 1)/(1 - q)$.
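The definition and the stated maximum can be sketched from scratch (hypothetical helpers, not the package API) and checked against each other on a uniform distribution:

```julia
# From-scratch sketch of the Tsallis formula and its stated maximum;
# `tsallis_entropy` and `tsallis_maximum` are hypothetical helpers.
tsallis_entropy(p; q = 2.0, k = 1.0) = k / (q - 1) * (1 - sum(pᵢ -> pᵢ^q, p))
tsallis_maximum(L; q = 2.0, k = 1.0) = k * (L^(1 - q) - 1) / (1 - q)

# The uniform distribution over L = 4 outcomes attains the maximum:
tsallis_entropy(fill(0.25, 4); q = 2.0)  # ≈ 0.75
tsallis_maximum(4; q = 2.0)              # ≈ 0.75
```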
ComplexityMeasures.Kaniadakis
— Type
Kaniadakis <: EntropyDefinition
Kaniadakis(; κ = 1.0, base = 2.0)
The Kaniadakis entropy (Tsallis, 2009)[Tsallis2009], used with entropy
to compute
\[H_K(p) = -\sum_{i=1}^N p_i\log_\kappa(p_i),\]
\[\log_\kappa(x) = \dfrac{x^\kappa - x^{-\kappa}}{2\kappa},\]
where if $\kappa = 0$, regular logarithm to the given base
is used, and log(0) = 0
.
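The κ-logarithm and the resulting entropy can be sketched in plain Julia (hypothetical helpers, not the package API); as κ → 0 the κ-logarithm reduces to the ordinary logarithm, recovering Shannon entropy in nats:

```julia
# Sketch of the κ-logarithm and Kaniadakis entropy defined above; hypothetical
# helpers, not the package API. κ = 0 falls back to the regular log (base e here).
logκ(x; κ = 1.0) = κ == 0 ? log(x) : (x^κ - x^(-κ)) / (2κ)
kaniadakis_entropy(p; κ = 1.0) =
    -sum(pᵢ -> pᵢ > 0 ? pᵢ * logκ(pᵢ; κ = κ) : 0.0, p)

kaniadakis_entropy([0.5, 0.5]; κ = 1.0)   # 0.75
kaniadakis_entropy([0.5, 0.5]; κ = 1e-8)  # ≈ log(2): recovers Shannon as κ → 0
```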
ComplexityMeasures.Curado
— Type
Curado <: EntropyDefinition
Curado(; b = 1.0)
The Curado entropy (Curado & Nobre, 2004)[Curado2004], used with entropy
to compute
\[H_C(p) = \sum_{i=1}^N \left( 1 - e^{-b p_i} \right) + e^{-b} - 1,\]
with b ∈ ℝ, b > 0
, where the terms outside the sum ensure that $H_C(0) = H_C(1) = 0$.
The maximum entropy for Curado is $L(1 - \exp(-b/L)) + \exp(-b) - 1$ with $L$ the total_outcomes
.
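A from-scratch sketch of the Curado entropy (hypothetical helper, not the package API), with the boundary terms ensuring that a fully certain outcome scores zero:

```julia
# Sketch of the Curado entropy; `curado_entropy` is a hypothetical helper,
# not the package API. The terms outside the sum make H_C vanish for a
# deterministic distribution.
curado_entropy(p; b = 1.0) = sum(pᵢ -> 1 - exp(-b * pᵢ), p) + exp(-b) - 1

curado_entropy([1.0, 0.0]; b = 2.0)  # ≈ 0: a certain outcome has zero entropy
curado_entropy([0.5, 0.5]; b = 2.0)  # > 0
```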
ComplexityMeasures.StretchedExponential
— Type
StretchedExponential <: EntropyDefinition
StretchedExponential(; η = 2.0, base = 2)
The stretched exponential, or Anteneodo-Plastino, entropy (Anteneodo & Plastino, 1999[Anteneodo1999]), used with entropy
to compute
\[S_{\eta}(p) = \sum_{i = 1}^N \Gamma \left( \dfrac{\eta + 1}{\eta}, - \log_{base}(p_i) \right) - p_i \Gamma \left( \dfrac{\eta + 1}{\eta} \right),\]
where $\eta \geq 0$, $\Gamma(\cdot, \cdot)$ is the upper incomplete Gamma function, and $\Gamma(\cdot) = \Gamma(\cdot, 0)$ is the Gamma function. Reduces to Shannon entropy for η = 1.0
.
The maximum entropy for StretchedExponential
is a rather complicated expression involving incomplete Gamma functions (see source code).
Discrete entropy
ComplexityMeasures.entropy
— Method
entropy([e::EntropyDefinition,] probs::Probabilities)
entropy([e::EntropyDefinition,] est::ProbabilitiesEstimator, x)
Compute the discrete entropy h::Real ∈ [0, ∞)
defined by e
, in one of two ways:
- Directly from existing Probabilities probs.
- From input data x, by first estimating a probability distribution using the provided ProbabilitiesEstimator, then computing the entropy from that distribution.
In fact, the second method is just a 2-lines-of-code wrapper that calls probabilities
and gives the result to the first method.
The entropy definition (first argument) is optional. Explicitly provide e
if you need to specify a logarithm base for the entropy. When est
is a probability estimator, Shannon()
is used by default.
Maximum entropy and normalized entropy
All discrete entropies e
have a well-defined maximum value for a given probability estimator. To obtain this value, call the entropy_maximum
function with the chosen entropy type and probability estimator. Alternatively, use entropy_normalized
to obtain the normalized form of the entropy (divided by the maximum).
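The normalization idea can be sketched from scratch for the Shannon case: divide the entropy by its maximum log₂(L) over L possible outcomes. `normalized_shannon` is a hypothetical helper, not the package API; in the package you would call `entropy_normalized`.

```julia
# From-scratch sketch of entropy normalization: divide the discrete Shannon
# entropy by its maximum log2(L) for L outcomes. `normalized_shannon` is a
# hypothetical helper, not the ComplexityMeasures.jl API.
function normalized_shannon(p)
    h = -sum(pᵢ -> pᵢ > 0 ? pᵢ * log2(pᵢ) : 0.0, p)
    return h / log2(length(p))
end

normalized_shannon([0.5, 0.5])  # 1.0: the uniform distribution is maximal
normalized_shannon([0.9, 0.1])  # < 1.0
```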
Examples
x = [rand(Bool) for _ in 1:10000] # coin toss
ps = probabilities(x) # gives about [0.5, 0.5] by definition
h = entropy(ps) # gives about 1, i.e. 1 bit by definition
h = entropy(Shannon(), ps) # syntactically equivalent to above
h = entropy(Shannon(), CountOccurrences(x), x) # syntactically equivalent to above
h = entropy(SymbolicPermutation(;m=3), x) # gives about 2, again by definition
h = entropy(Renyi(2.0), ps) # also gives 1, order `q` doesn't matter for coin toss
ComplexityMeasures.entropy_maximum
— Function
entropy_maximum(e::EntropyDefinition, est::ProbabilitiesEstimator)
Return the maximum value of a discrete entropy for the given probabilities estimator.
entropy_maximum(e::EntropyDefinition, L::Int)
Same as above, but computed directly from the number of total outcomes L
.
ComplexityMeasures.entropy_normalized
— Function
entropy_normalized([e::EntropyDefinition,] est::ProbabilitiesEstimator, x) → h̃
Return h̃ ∈ [0, 1]
, the normalized discrete entropy of x
, i.e. the value of entropy
divided by the maximum value for e
, according to the given probabilities estimator. If e
is not given, it defaults to Shannon()
.
Notice that there is no method entropy_normalized(e::EntropyDefinition, probs::Probabilities)
, because there is no way to know the number of possible events (i.e., the total_outcomes
) from probs
.
Differential entropy
ComplexityMeasures.entropy
— Method
entropy([e::EntropyDefinition,] est::DifferentialEntropyEstimator, x)
Approximate the differential entropy h::Real
according to the definition e
using the provided DifferentialEntropyEstimator
and input data x
. This method doesn't involve explicitly computing (discretized) probabilities first.
The entropy definition argument is optional. The default entropy type is inferred from the estimator (e.g. Kraskov
estimates the base-2 Shannon differential entropy). The estimators are not compatible with all versions of EntropyDefinition
. See Table of differential entropy estimators in the docs for a table view of the estimators and the compatibilities.
Examples
A standard normal distribution has a base-e differential entropy of 0.5*log(2π) + 0.5
nats.
def = Shannon(; base = ℯ) # Base `ℯ` for nats.
est = Kraskov(k = 5)
h = entropy(def, est, randn(1_000_000))
abs(h - 0.5*log(2π) - 0.5) # ≈ 0.001
Table of differential entropy estimators
The following estimators are differential entropy estimators, and can also be used with entropy
.
Each DifferentialEntropyEstimator
uses a specialized technique to approximate relevant densities/integrals, and is often tailored to one or a few types of generalized entropy. For example, Kraskov
estimates the Shannon
entropy.
| Estimator | Principle | Input data | Shannon | Renyi | Tsallis | Kaniadakis | Curado | StretchedExponential |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| KozachenkoLeonenko | Nearest neighbors | Dataset | ✓ | x | x | x | x | x |
| Kraskov | Nearest neighbors | Dataset | ✓ | x | x | x | x | x |
| Zhu | Nearest neighbors | Dataset | ✓ | x | x | x | x | x |
| ZhuSingh | Nearest neighbors | Dataset | ✓ | x | x | x | x | x |
| Goria | Nearest neighbors | Dataset | ✓ | x | x | x | x | x |
| Vasicek | Order statistics | Vector | ✓ | x | x | x | x | x |
| Ebrahimi | Order statistics | Vector | ✓ | x | x | x | x | x |
| Correa | Order statistics | Vector | ✓ | x | x | x | x | x |
| AlizadehArghami | Order statistics | Vector | ✓ | x | x | x | x | x |
ComplexityMeasures.DifferentialEntropyEstimator
— Type
DifferentialEntropyEstimator
DiffEntropyEst # alias
The supertype of all differential entropy estimators. These estimators compute an entropy value in various ways that do not involve explicitly estimating a probability distribution.
Currently implemented estimators are:
See entropy
for usage.
ComplexityMeasures.Kraskov
— Type
Kraskov <: DiffEntropyEst
Kraskov(; k::Int = 1, w::Int = 0)
The Kraskov
estimator computes the Shannon
differential entropy
of x
(a multi-dimensional Dataset
) using the k
-th nearest neighbor searches method from [Kraskov2004].
w
is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0
, meaning that only the point itself is excluded when searching for neighbours).
Description
Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Kraskov
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
See also: entropy
, KozachenkoLeonenko
, DifferentialEntropyEstimator
.
ComplexityMeasures.KozachenkoLeonenko
— Type
KozachenkoLeonenko <: DiffEntropyEst
KozachenkoLeonenko(; w::Int = 0)
The KozachenkoLeonenko
estimator computes the Shannon
differential entropy
of x
(a multi-dimensional Dataset
).
Description
Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. KozachenkoLeonenko
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))]\]
using the nearest neighbor method from Kozachenko & Leonenko (1987)[KozachenkoLeonenko1987], as described in Charzyńska and Gambin[Charzyńska2016].
w
is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0
, meaning that only the point itself is excluded when searching for neighbours).
In contrast to Kraskov
, this estimator uses only the closest neighbor.
See also: entropy
, Kraskov
, DifferentialEntropyEstimator
.
ComplexityMeasures.Zhu
— Type
Zhu <: DiffEntropyEst
Zhu(k = 1, w = 0)
The Zhu
estimator (Zhu et al., 2015)[Zhu2015] is an extension to KozachenkoLeonenko
, and computes the Shannon
differential entropy
of x
(a multi-dimensional Dataset
).
Description
Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Zhu
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))]\]
by approximating densities within hyperrectangles surrounding each point xᵢ ∈ x
using k
nearest neighbor searches. w
is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0
, meaning that only the point itself is excluded when searching for neighbours).
See also: entropy
, KozachenkoLeonenko
, DifferentialEntropyEstimator
.
ComplexityMeasures.ZhuSingh
— Type
ZhuSingh <: DiffEntropyEst
ZhuSingh(k = 1, w = 0)
The ZhuSingh
estimator (Zhu et al., 2015)[Zhu2015] computes the Shannon
differential entropy
of x
(a multi-dimensional Dataset
).
Description
Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. ZhuSingh
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
Like Zhu
, this estimator approximates probabilities within hyperrectangles surrounding each point xᵢ ∈ x
using k
nearest neighbor searches. However, it also considers the number of neighbors falling on the borders of these hyperrectangles. This estimator is an extension to the entropy estimator in Singh et al. (2003)[Singh2003].
w
is the Theiler window, which determines if temporal neighbors are excluded during neighbor searches (defaults to 0
, meaning that only the point itself is excluded when searching for neighbours).
See also: entropy
, DifferentialEntropyEstimator
.
ComplexityMeasures.Goria
— Type
Goria <: DifferentialEntropyEstimator
Goria(; k = 1, w = 0)
The Goria
estimator computes the Shannon
differential entropy
of x
(a multi-dimensional Dataset
).
Description
Assume we have samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ from a continuous random variable $X \in \mathbb{R}^d$ with support $\mathcal{X}$ and density function $f : \mathbb{R}^d \to \mathbb{R}$. Goria
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
Specifically, let $\bf{n}_1, \bf{n}_2, \ldots, \bf{n}_N$ be the distance of the samples $\{\bf{x}_1, \bf{x}_2, \ldots, \bf{x}_N \}$ to their k
-th nearest neighbors. Next, let the geometric mean of the distances be
\[\hat{\rho}_k = \left( \prod_{i=1}^N \bf{n}_i \right)^{\dfrac{1}{N}}\]
Goria et al. (2005)[Goria2005]'s estimate of Shannon differential entropy is then
\[\hat{H} = m \log \hat{\rho}_k + \log(N - 1) - \psi(k) + \log c_1(m),\]
where $c_1(m) = \dfrac{2\pi^\frac{m}{2}}{m \Gamma(m/2)}$ and $\psi$ is the digamma function.
ComplexityMeasures.Vasicek
— Type
Vasicek <: DiffEntropyEst
Vasicek(; m::Int = 1)
The Vasicek
estimator computes the Shannon
differential entropy
of x
(a timeseries, i.e. a one-dimensional Vector
) using the method from Vasicek (1976)[Vasicek1976].
The Vasicek
estimator belongs to a class of differential entropy estimators based on order statistics, of which Vasicek (1976) was the first. It only works for timeseries input.
Description
Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. Vasicek
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
However, instead of estimating the above integral directly, it makes use of the equivalent integral, where $F$ is the distribution function for $X$,
\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp\]
This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$. The Vasicek
Shannon
differential entropy estimate is then
\[\hat{H}_V(\bar{X}, m) = \dfrac{1}{n} \sum_{i = 1}^n \log \left[ \dfrac{n}{2m} (\bar{X}_{(i+m)} - \bar{X}_{(i-m)}) \right]\]
Usage
In practice, choice of m
influences how fast the entropy converges to the true value. For small values of m
, convergence is slow, so we recommend to scale m
according to the time series length n
and use m >= n/100
(this is just a heuristic based on the tests written for this package).
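The estimate above can be written directly from the order statistics. The following is a simplified from-scratch sketch in nats (not the package implementation, which may handle boundaries differently); it clamps out-of-range indices to the sample extremes:

```julia
using Random

# From-scratch sketch of the Vasicek estimate, in nats (natural log);
# boundary indices are clamped, i.e. X₍ᵢ₎ = X₍₁₎ for i < 1 and X₍ₙ₎ for i > n.
# Not the ComplexityMeasures.jl implementation.
function vasicek_entropy(x::AbstractVector; m::Int = 1)
    n = length(x)
    xs = sort(x)                    # order statistics X₍₁₎ ≤ ⋯ ≤ X₍ₙ₎
    at(i) = xs[clamp(i, 1, n)]
    return sum(i -> log(n / (2m) * (at(i + m) - at(i - m))), 1:n) / n
end

x = randn(Xoshiro(1234), 100_000)
vasicek_entropy(x; m = 1_000)  # should approach 0.5log(2π) + 0.5 ≈ 1.419
```

Note how m is chosen as n/100, following the heuristic above.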
See also: entropy
, Correa
, AlizadehArghami
, Ebrahimi
, DifferentialEntropyEstimator
.
ComplexityMeasures.AlizadehArghami
— Type
AlizadehArghami <: DiffEntropyEst
AlizadehArghami(; m::Int = 1)
The AlizadehArghami
estimator computes the Shannon
differential entropy
of x
(a timeseries, i.e. a one-dimensional Vector
) using the method from Alizadeh & Arghami (2010)[Alizadeh2010].
The AlizadehArghami
estimator belongs to a class of differential entropy estimators based on order statistics. It only works for timeseries input.
Description
Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. AlizadehArghami
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
However, instead of estimating the above integral directly, it makes use of the equivalent integral, where $F$ is the distribution function for $X$:
\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp.\]
This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$. The AlizadehArghami
Shannon
differential entropy estimate is then the Vasicek
estimate $\hat{H}_{V}(\bar{X}, m, n)$, plus a correction factor
\[\hat{H}_{A}(\bar{X}, m, n) = \hat{H}_{V}(\bar{X}, m, n) + \dfrac{2}{n}\left(m \log(2) \right).\]
See also: entropy
, Correa
, Ebrahimi
, Vasicek
, DifferentialEntropyEstimator
.
ComplexityMeasures.Ebrahimi
— Type
Ebrahimi <: DiffEntropyEst
Ebrahimi(; m::Int = 1)
The Ebrahimi
estimator computes the Shannon
differential entropy
of x
(a timeseries, i.e. a one-dimensional Vector
) using the method from Ebrahimi (1994)[Ebrahimi1994].
The Ebrahimi
estimator belongs to a class of differential entropy estimators based on order statistics. It only works for timeseries input.
Description
Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. Ebrahimi
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
However, instead of estimating the above integral directly, it makes use of the equivalent integral, where $F$ is the distribution function for $X$,
\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp\]
This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$. The Ebrahimi
Shannon
differential entropy estimate is then
\[\hat{H}_{E}(\bar{X}, m) = \dfrac{1}{n} \sum_{i = 1}^n \log \left[ \dfrac{n}{c_i m} (\bar{X}_{(i+m)} - \bar{X}_{(i-m)}) \right],\]
where
\[c_i = \begin{cases} 1 + \frac{i - 1}{m}, & 1 \leq i \leq m \\ 2, & m + 1 \leq i \leq n - m \\ 1 + \frac{n - i}{m}, & n - m + 1 \leq i \leq n \end{cases}.\]
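The piecewise weights, which are reduced near the sample boundaries and equal to 2 in the interior, can be sketched as (hypothetical helper, not the package API):

```julia
# Sketch of the Ebrahimi boundary weights cᵢ: they taper toward 1 at the edges
# of the sample and equal 2 in the interior. Hypothetical helper, not the
# ComplexityMeasures.jl API.
c(i, n, m) = i ≤ m ? 1 + (i - 1) / m : (i ≤ n - m ? 2.0 : 1 + (n - i) / m)

c(1, 100, 5)    # 1.0 at the lower boundary
c(50, 100, 5)   # 2.0 in the interior
c(100, 100, 5)  # 1.0 at the upper boundary
```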
See also: entropy
, Correa
, AlizadehArghami
, Vasicek
, DifferentialEntropyEstimator
.
ComplexityMeasures.Correa
— Type
Correa <: DiffEntropyEst
Correa(; m::Int = 1)
The Correa
estimator computes the Shannon
differential entropy
of x
(a timeseries, i.e. a one-dimensional Vector
) using the method from Correa (1995)[Correa1995].
The Correa
estimator belongs to a class of differential entropy estimators based on order statistics. It only works for timeseries input.
Description
Assume we have samples $\bar{X} = \{x_1, x_2, \ldots, x_N \}$ from a continuous random variable $X \in \mathbb{R}$ with support $\mathcal{X}$ and density function $f : \mathbb{R} \to \mathbb{R}$. Correa
estimates the Shannon differential entropy
\[H(X) = -\int_{\mathcal{X}} f(x) \log f(x) dx = \mathbb{E}[-\log(f(X))].\]
However, instead of estimating the above integral directly, Correa
makes use of the equivalent integral, where $F$ is the distribution function for $X$,
\[H(X) = \int_0^1 \log \left(\dfrac{d}{dp}F^{-1}(p) \right) dp\]
This integral is approximated by first computing the order statistics of $\bar{X}$ (the input timeseries), i.e. $x_{(1)} \leq x_{(2)} \leq \cdots \leq x_{(n)}$, ensuring that end points are included. The Correa
estimate of Shannon
differential entropy is then
\[H_C(\bar{X}, m, n) = -\dfrac{1}{n} \sum_{i = 1}^n \log \left[ \dfrac{ \sum_{j=i-m}^{i+m}(\bar{X}_{(j)} - \tilde{X}_{(i)})(j - i)}{n \sum_{j=i-m}^{i+m} (\bar{X}_{(j)} - \tilde{X}_{(i)})^2} \right],\]
where
\[\tilde{X}_{(i)} = \dfrac{1}{2m + 1} \sum_{j = i - m}^{i + m} X_{(j)}.\]
See also: entropy
, AlizadehArghami
, Ebrahimi
, Vasicek
, DifferentialEntropyEstimator
.
- Amigó2018Amigó, J. M., Balogh, S. G., & Hernández, S. (2018). A brief review of generalized entropies. Entropy, 20(11), 813.
- Shannon1948C. E. Shannon, Bell Systems Technical Journal 27, pp 379 (1948)
- Rényi1960A. Rényi, Proceedings of the fourth Berkeley Symposium on Mathematics, Statistics and Probability, pp 547 (1960)
- Tsallis1988Tsallis, C. (1988). Possible generalization of Boltzmann-Gibbs statistics. Journal of statistical physics, 52(1), 479-487.
- Tsallis2009Tsallis, C. (2009). Introduction to nonextensive statistical mechanics: approaching a complex world. Springer, 1(1), 2-1.
- Curado2004Curado, E. M., & Nobre, F. D. (2004). On the stability of analytic entropic forms. Physica A: Statistical Mechanics and its Applications, 335(1-2), 94-106.
- Anteneodo1999Anteneodo, C., & Plastino, A. R. (1999). Maximum entropy approach to stretched exponential probability distributions. Journal of Physics A: Mathematical and General, 32(7), 1089.
- Kraskov2004Kraskov, A., Stögbauer, H., & Grassberger, P. (2004). Estimating mutual information. Physical review E, 69(6), 066138.
- Charzyńska2016Charzyńska, A., & Gambin, A. (2016). Improvement of the k-NN entropy estimator with applications in systems biology. Entropy, 18(1), 13.
- KozachenkoLeonenko1987Kozachenko, L. F., & Leonenko, N. N. (1987). Sample estimate of the entropy of a random vector. Problemy Peredachi Informatsii, 23(2), 9-16.
- Zhu2015Zhu, J., Bellanger, J. J., Shu, H., & Le Bouquin Jeannès, R. (2015). Contribution to transfer entropy estimation via the k-nearest-neighbors approach. Entropy, 17(6), 4173-4201.
- Singh2003Singh, H., Misra, N., Hnizdo, V., Fedorowicz, A., & Demchuk, E. (2003). Nearest neighbor estimates of entropy. American journal of mathematical and management sciences, 23(3-4), 301-321.
- Goria2005Goria, M. N., Leonenko, N. N., Mergel, V. V., & Novi Inverardi, P. L. (2005). A new class of random vector entropy estimators and its applications in testing statistical hypotheses. Journal of Nonparametric Statistics, 17(3), 277-297.
- Vasicek1976Vasicek, O. (1976). A test for normality based on sample entropy. Journal of the Royal Statistical Society: Series B (Methodological), 38(1), 54-59.
- Alizadeh2010Alizadeh, N. H., & Arghami, N. R. (2010). A new estimator of entropy. Journal of the Iranian Statistical Society (JIRSS).
- Ebrahimi1994Ebrahimi, N., Pflughoeft, K., & Soofi, E. S. (1994). Two measures of sample entropy. Statistics & Probability Letters, 20(3), 225-234.
- Correa1995Correa, J. C. (1995). A new estimator of entropy. Communications in Statistics-Theory and Methods, 24(10), 2439-2449.