Histogram estimation
Entropies.non0hist — Method
Histograms from collections
non0hist(x::AbstractVector; normalize::Bool = true) → p::Vector{Float64}
Compute the unordered histogram of the values of x, directly from the distribution of values, without any coarse-graining or discretization. Assumes that x can be sorted.
If normalize==true, then the histogram is sum-normalized. If normalize==false, then the occurrence counts for the unique elements in x are returned.
Example
using Entropies
x = rand(1:10, 100000)
Entropies.non0hist(x) # sum-normalized
Entropies.non0hist(x, normalize = false) # histogram (counts)
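As a rough illustration of why x must be sortable, the counting can be sketched with Base Julia alone: sort a copy of the input, then count runs of equal neighbours. The helper name sorted_run_hist is hypothetical and is not part of Entropies; this is a sketch of the general sort-and-count idea, not necessarily how non0hist is implemented.

```julia
# Hypothetical helper, NOT part of Entropies.jl: a sort-and-count sketch.
function sorted_run_hist(x::AbstractVector; normalize::Bool = true)
    isempty(x) && return normalize ? Float64[] : Int[]
    s = sort(x)                 # requires that the elements of x can be ordered
    counts = Int[]
    runlen = 1                  # length of the current run of equal values
    for i in 2:length(s)
        if s[i] == s[i - 1]
            runlen += 1
        else
            push!(counts, runlen)
            runlen = 1
        end
    end
    push!(counts, runlen)       # close the final run
    # Sum-normalize to relative frequencies, or return the raw counts
    return normalize ? counts ./ length(s) : counts
end

counts = sorted_run_hist([3, 1, 3, 2, 3]; normalize = false)
probs = sorted_run_hist([3, 1, 3, 2, 3])
```

In this sketch the bins happen to come out in sorted order of the values, but no values are attached to the bins, which matches the "unordered" description above: only the counts or probabilities are returned.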
Histograms of Datasets
non0hist(x::AbstractDataset; normalize::Bool = true) → p::Vector{Float64}
Compute the unordered histogram of the values of the Dataset x, directly from the distribution of points, without any coarse-graining or discretization.
Example
using DelayEmbeddings, Entropies
D = Dataset(rand(1:3, 50000, 3))
Entropies.non0hist(D) # sum-normalized
Entropies.non0hist(D, normalize = false) # histogram (counts)
non0hist(ε, dataset::AbstractDataset; normalize = true) → p
Partition a dataset into tabulated intervals (boxes) of size ε and return the (sum-normalized, if normalize==true) histogram in an unordered 1D form, discarding all zero elements and bin edge information.
Performance Notes
This method has a linearithmic time complexity (n log(n) for n = length(data)) and a linear space complexity (l for l = dimension(data)). This allows histograms to be computed for high-dimensional datasets, even with small box sizes ε, without memory overflow and with maximum performance.
Use binhist to retain bin edge information.
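To illustrate how a box histogram can avoid allocating a dense grid, here is a minimal Base-Julia sketch: each point is mapped to a tuple of integer box indices, the tuples are sorted (the n log(n) step), and runs of equal tuples are counted. The helper name box_hist, the representation of points as tuples, and the alignment of boxes to the origin are all assumptions made for illustration; the library may align boxes differently and is not claimed to work this way internally.

```julia
# Hypothetical helper, NOT part of Entropies.jl. `points` is a vector of
# equal-length numeric tuples; boxes are assumed to be aligned to the origin.
function box_hist(ε::Real, points::AbstractVector{<:Tuple}; normalize::Bool = true)
    isempty(points) && return normalize ? Float64[] : Int[]
    # Integer box index along every coordinate of every point
    boxes = [map(c -> floor(Int, c / ε), p) for p in points]
    sort!(boxes)                # tuples compare lexicographically
    counts = Int[]
    runlen = 1                  # number of points in the current box
    for i in 2:length(boxes)
        if boxes[i] == boxes[i - 1]
            runlen += 1
        else
            push!(counts, runlen)
            runlen = 1
        end
    end
    push!(counts, runlen)       # close the final run
    # Only occupied boxes ever appear, so empty bins are never stored
    return normalize ? counts ./ length(boxes) : counts
end

pts = [(0.1, 0.1), (0.2, 0.3), (0.9, 0.9), (0.6, 0.1)]
box_counts = box_hist(0.5, pts; normalize = false)
```

Because only the occupied boxes are enumerated, the number of stored bins in this sketch is bounded by the number of points rather than by the number of boxes in the full grid, which is what makes small ε feasible in high dimensions.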