ComplexityMeasures.jl

ComplexityMeasures (Module)

ComplexityMeasures.jl is a Julia-based software for calculating thousands of different probabilities, entropies, and other so-called complexity measures from single-variable input datasets. For relational measures across many input datasets, see its extension Associations.jl. If you use another programming language (Python, R, MATLAB, ...), you can still use ComplexityMeasures.jl thanks to Julia's interoperability. For example, from Python use juliacall.

A careful comparison with widely used alternative software shows that ComplexityMeasures.jl outclasses the alternatives in several objective aspects, such as computational performance, total number of measures, reliability, and extendability. See the associated publication for more details.

The key features that ComplexityMeasures.jl provides can be summarized as:

  • A rigorous framework for extracting probabilities from data, based on the mathematical formulation of probability spaces.
  • Several (12+) outcome spaces, i.e., ways to discretize data into probabilities.
  • Several estimators for estimating probabilities given an outcome space, which correct theoretically known estimation biases.
  • Several definitions of information measures, such as various flavours of entropies (Shannon, Tsallis, Curado...), extropies, and other complexity measures, that are used in the context of nonlinear dynamics, nonlinear timeseries analysis, and complex systems.
  • Several discrete and continuous (differential) estimators for entropies, which correct theoretically known estimation biases.
  • An extendable interface and well thought out API accompanied by dedicated developer documentation. This makes it trivial to define new outcome spaces, or new estimators for probabilities, information measures, or complexity measures and integrate them with everything else in the software without boilerplate code.
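The features listed above can be sketched in a few lines. This is a minimal, non-authoritative example that assumes the v3 API names `OrdinalPatterns`, `probabilities`, `information`, and `Shannon`; the input `x` is synthetic random data chosen only for illustration.

```julia
using ComplexityMeasures

# Synthetic single-variable input (illustrative only).
x = randn(10_000)

# An outcome space: ordinal patterns of length 3 (at most 3! = 6 outcomes).
o = OrdinalPatterns{3}()

# Probabilities over the outcomes that were observed in `x`.
probs = probabilities(o, x)

# Discrete Shannon entropy (base 2) of those probabilities.
h = information(Shannon(base = 2), o, x)
```

Because the outcome space is passed as an argument, swapping `o` for a different outcome space leaves the rest of the code unchanged.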

ComplexityMeasures.jl can be used as a standalone package, or as part of other projects in the JuliaDynamics organization, such as DynamicalSystems.jl or Associations.jl.

To install it, run import Pkg; Pkg.add("ComplexityMeasures").

All further information is provided in the documentation, which you can either find online or build locally by running the docs/make.jl file.

Previously, this package was called Entropies.jl.

Latest news

ComplexityMeasures.jl has been updated to v3!

The software has been massively improved and its core principles have been redesigned to be extendable, accessible, and more closely based on the rigorous mathematics of probabilities and entropies.

For more details on this release, please see our announcement post on Discourse or the central Tutorial of the v3 documentation.

In v3 many concepts were renamed, but there are no formally breaking changes: everything that changed has been deprecated and remains backwards compatible. See CHANGELOG.md for more details!

Documentation contents

  • Before anything else, we recommend that users go through our overarching Tutorial, which teaches not only central API functions, but also terminology and crucial core concepts.
  • Probabilities lists all outcome spaces and probabilities estimators.
  • Information measures lists all implemented information measure definitions and estimators (both discrete and differential).
  • Complexity measures lists all implemented complexity measures that are not functionals of probabilities (unlike information measures).
  • The Examples page lists dozens of runnable example code snippets along with their outputs.

Input data for ComplexityMeasures.jl

The input data type typically depends on the chosen outcome space. In general, though, the standard DynamicalSystems.jl approach is taken, which gives three types of input data:

  • Timeseries, which are AbstractVector{<:Real}, used e.g. with WaveletOverlap.
  • Multi-variate timeseries, or datasets, or state space sets, which are StateSpaceSets, used e.g. with NaiveKernel. The short syntax SSSet may be used instead of StateSpaceSet.
  • Spatial data, which are higher dimensional standard Arrays, used e.g. with SpatialOrdinalPatterns.
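As a rough illustration of the three input kinds listed above, the sketch below builds one value of each from random numbers. It assumes that `using ComplexityMeasures` makes `StateSpaceSet` and `dimension` available (re-exported from StateSpaceSets.jl); the variable names are hypothetical.

```julia
using ComplexityMeasures  # assumed to re-export StateSpaceSet and dimension

ts  = rand(1000)                                          # timeseries
X   = StateSpaceSet(rand(1000), rand(1000), rand(1000))   # multivariate timeseries, 3 variables
img = rand(50, 50)                                        # spatial data as a standard 2D Array

# Check that each value has the type named in the list above.
is_timeseries = ts isa AbstractVector{<:Real}
is_ssset      = X isa StateSpaceSet
is_spatial    = img isa Array{<:Real, 2}
nvars         = dimension(X)
```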

StateSpaceSets.StateSpaceSet (Type)

StateSpaceSet{D, T, V} <: AbstractVector{V}

A dedicated interface for sets in a state space. It is an ordered container of equally-sized points of length D, with element type T, represented by a vector of type V. Typically V is SVector{D,T} or Vector{T} and the data are always stored internally as Vector{V}. SSSet is an alias for StateSpaceSet.

The underlying Vector{V} can be obtained by vec(ssset), although this is almost never necessary because StateSpaceSet subtypes AbstractVector and extends its interface. StateSpaceSet also supports almost all sensible vector operations like append!, push!, hcat, eachrow, among others. When iterated over, it iterates over its contained points.

Construction

Constructing a StateSpaceSet is done in three ways:

  1. By giving each individual column of the state space set as a Vector{<:Real}: StateSpaceSet(x, y, z, ...).
  2. By giving a matrix whose rows are the state space points: StateSpaceSet(m).
  3. By giving directly a vector of vectors (state space points): StateSpaceSet(v_of_v).

All constructors allow for two keywords:

  • container which sets the type of V (the type of inner vectors). At the moment options are only SVector, MVector, or Vector, and by default SVector is used.
  • names which can be an iterable of length D whose elements are Symbols. This allows assigning a name to each dimension and accessing the dimension by name, see below. names is nothing if not given. Use StateSpaceSet(s; names) to add names to an existing set s.
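The three construction routes can be sketched as follows. This is an illustrative example on random data, assuming only the constructors named above plus the exported `dimension` function; the keywords are left at their defaults.

```julia
using StateSpaceSets

x, y, z = rand(100), rand(100), rand(100)

# 1. From individual columns.
X1 = StateSpaceSet(x, y, z)

# 2. From a 100×3 matrix whose rows are the state space points.
X2 = StateSpaceSet(hcat(x, y, z))

# 3. From a vector of vectors (one inner vector per point).
X3 = StateSpaceSet([[x[i], y[i], z[i]] for i in eachindex(x)])

# All three hold the same 100 points in 3 dimensions.
same = (X1 == X2) && (X2 == X3)
npoints, ndims_ = length(X1), dimension(X1)
```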

Description of indexing

When indexed with 1 index, StateSpaceSet behaves exactly like its encapsulated vector, i.e., a vector of vectors (state space points). When indexed with 2 indices, it behaves like a matrix where each row is a point.

In the following let i, j be integers, typeof(X) <: AbstractStateSpaceSet and v1, v2 be <: AbstractVector{Int} (v1, v2 could also be ranges, and for performance benefits make v2 an SVector{Int}).

  • X[i] == X[i, :] gives the ith point (returns an SVector)
  • X[v1] == X[v1, :], returns a StateSpaceSet with the points in those indices.
  • X[:, j] gives the jth variable timeseries (or collection), as Vector
  • X[v1, v2], X[:, v2] returns a StateSpaceSet with the appropriate entries (first indices being "time"/point index, while second being variables)
  • X[i, j] gives the value of the jth variable at the ith timepoint

In all examples above, j can also be a Symbol, provided that names has been given when creating the state space set. This allows accessing a dimension by name. It is provided as a convenience and is not an optimized operation, so it is recommended primarily for X[:, j::Symbol].

Use Matrix(ssset) or StateSpaceSet(matrix) to convert. It is assumed that each column of the matrix is one variable. If you have various timeseries vectors x, y, z, ... pass them like StateSpaceSet(x, y, z, ...). You can use columns(dataset) to obtain the reverse, i.e. all columns of the dataset in a tuple.
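The indexing and conversion rules above can be checked on a small hand-written set. This is a minimal sketch using integer indices only; the matrix `m` is made up for illustration.

```julia
using StateSpaceSets

# 3 points (rows) in a 2-dimensional state space (columns are variables).
m = [1.0 10.0;
     2.0 20.0;
     3.0 30.0]
X = StateSpaceSet(m)

p    = X[2]        # 2nd point: [2.0, 20.0]
val  = X[2, 1]     # 1st variable at the 2nd point: 2.0
col  = X[:, 2]     # 2nd variable as a Vector: [10.0, 20.0, 30.0]
head = X[1:2]      # a StateSpaceSet with the first two points

# Converting back: a Matrix, or a tuple of column vectors.
m_back = Matrix(X)
c1, c2 = columns(X)
```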

Total entropy/information/complexity measures

ComplexityMeasures.jl offers thousands of measures computable right out of the box. For the exact number, see this calculation page.