Histogram estimation

Entropies.non0hist - Method

Histograms from collections

non0hist(x::AbstractVector; normalize::Bool = true) → p::Vector{Float64}

Compute the unordered histogram of the values of x, directly from the distribution of values, without any coarse-graining or discretization. Assumes that x can be sorted.

If normalize==true, then the histogram is sum-normalized, so its entries sum to 1. If normalize==false, then the occurrence counts for the unique elements in x are returned.

Example

using Entropies
x = rand(1:10, 100000)
Entropies.non0hist(x) # sum-normalized
Entropies.non0hist(x, normalize = false) # histogram (counts)
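
The idea of building a histogram directly from sortable values can be illustrated with a minimal Base-only sketch. The helper name sorted_run_hist is hypothetical and is not part of Entropies; sorting and then counting runs of equal values is one plausible way to do this, not necessarily how non0hist is implemented.

```julia
# Hypothetical helper (not part of Entropies): sort the values, then count
# runs of equal neighbours. Each run corresponds to one unique value of x.
function sorted_run_hist(x::AbstractVector; normalize::Bool = true)
    isempty(x) && return Float64[]
    xs = sort(x)                  # requires that the elements of x can be sorted
    counts = Float64[1.0]         # the first element starts the first run
    for i in 2:length(xs)
        if xs[i] == xs[i - 1]
            counts[end] += 1      # same value as before: extend the current run
        else
            push!(counts, 1.0)    # new value: start a new run
        end
    end
    # Sum-normalize only on request; otherwise return raw occurrence counts.
    return normalize ? counts ./ sum(counts) : counts
end

sorted_run_hist([3, 1, 3, 2, 3, 1]; normalize = false)  # returns [2.0, 1.0, 3.0]
```

In this sketch the entries follow the sorted order of the unique values, but no labels are returned, so the result should still be read as an unordered histogram.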

Histograms of Datasets

non0hist(x::AbstractDataset; normalize::Bool = true) → p::Vector{Float64}

Compute the unordered histogram of the points of the Dataset x, directly from the distribution of points, without any coarse-graining or discretization. Each distinct point (row) of x is treated as one value.

Example

using DelayEmbeddings, Entropies
D = Dataset(rand(1:3, 50000, 3))
Entropies.non0hist(D) # sum-normalized
Entropies.non0hist(D, normalize = false) # histogram (counts)
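
For multivariate data, the histogram is taken over whole points rather than over individual coordinates. The sketch below illustrates this with Base only, representing points as tuples. The helper name point_hist is hypothetical, and it uses dictionary counting as a stand-in technique; it is not a description of how Entropies handles a Dataset internally.

```julia
# Hypothetical helper (not part of Entropies): count how often each distinct
# point occurs, using a Dict keyed by the whole point.
function point_hist(points::AbstractVector{<:Tuple}; normalize::Bool = true)
    counts = Dict{eltype(points), Int}()
    for p in points
        counts[p] = get(counts, p, 0) + 1   # points compare equal coordinate-wise
    end
    c = Float64.(collect(values(counts)))   # Dict order is arbitrary: unordered histogram
    return normalize ? c ./ sum(c) : c
end

pts = [(1, 2), (1, 2), (3, 1), (1, 2), (3, 1), (2, 2)]
sort(point_hist(pts; normalize = false))    # returns [1.0, 2.0, 3.0]
```

The result is sorted here only to make the output reproducible, since the dictionary does not guarantee any particular order.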
non0hist(ε, dataset::AbstractDataset; normalize = true) → p

Partition a dataset into tabulated intervals (boxes) of size ε and return the (sum-normalized, if normalize==true) histogram in an unordered 1D form, discarding all zero elements and bin edge information.

Performance Notes

This method has linearithmic time complexity (n log(n) for n = length(dataset)) and linear space complexity (l for l = dimension(dataset)). This makes it possible to compute histograms of high-dimensional datasets, even with small box sizes ε, efficiently and without memory overflow.

Use binhist to retain bin edge information.
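
To make the partitioning step and the n log(n) cost concrete, here is a minimal Base-only sketch. The helper name box_hist is hypothetical, the points are assumed to be the rows of a non-empty matrix, and the boxes are assumed to be anchored at the per-dimension minimum of the data. None of this is confirmed to match the library's implementation, and unlike the method described above the sketch stores all box indices at once, so it does not reproduce the stated space complexity.

```julia
# Hypothetical helper (not part of Entropies). Assumes `points` is a non-empty
# n×d matrix with one point per row, and boxes of side ε anchored at the
# per-dimension minimum of the data.
function box_hist(ε::Real, points::AbstractMatrix{<:Real}; normalize::Bool = true)
    mins = minimum(points, dims = 1)              # 1×d row of per-dimension minima
    boxes = floor.(Int, (points .- mins) ./ ε)    # integer box index per coordinate
    sorted = sortslices(boxes, dims = 1)          # lexicographic row sort: the n log(n) step
    counts = Float64[]
    for i in axes(sorted, 1)
        if i > 1 && sorted[i, :] == sorted[i - 1, :]
            counts[end] += 1                      # same box as the previous point
        else
            push!(counts, 1.0)                    # first point of a new non-empty box
        end
    end
    return normalize ? counts ./ sum(counts) : counts
end

P = [0.0 0.0; 0.1 0.1; 0.6 0.2; 0.9 0.9; 0.95 0.8]
box_hist(0.5, P; normalize = false)               # returns [2.0, 1.0, 2.0]
```

Only boxes that contain at least one point ever appear in the output, which is why empty boxes never need to be stored, and the box indices themselves are discarded once the counts are formed.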
