Associations.jl

Associations — Module

Associations

Associations.jl is a package for quantifying associations, independence testing and causal inference.

All further information is provided in the documentation, which you can either find online or build locally by running the docs/make.jl file.

Key features

Association API: includes measures and their estimators for pairwise, conditional and other forms of association from conventional statistics, from dynamical systems theory, and from information theory: partial correlation, distance correlation, (conditional) mutual information, transfer entropy, convergent cross mapping and a lot more!
Independence testing API, which is automatically compatible with every association measure estimator implemented in the package.
Causal (network) inference API integrating the association measures and independence testing framework.

Additional features

Building on features from ComplexityMeasures.jl, we also offer:

Discretization API for multiple (multivariate) input datasets.
Multivariate counting and probability estimation API.
Multivariate information measure API.

Installation

To install the package, run import Pkg; Pkg.add("Associations").

Previously, this package was called CausalityTools.jl.


Latest news

Package rename

The package has been renamed from CausalityTools.jl to Associations.jl.

Associations.jl has been updated to v4!

This update includes a number of breaking changes. They were made to ensure compatibility with ComplexityMeasures.jl v3, which provides the discretization functionality that we use here.

Important changes are:

Convenience methods have been removed completely. Use association instead.
Example systems have been removed.
The syntax for computing an association has changed. An estimator now always contains the definition it estimates. For example, association(MIShannon(), KSG1(), x, y) is now association(KSG1(MIShannon()), x, y).
SurrogateTest has been renamed to SurrogateAssociationTest.
See the CHANGELOG.md for a complete list of changes.
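
The new calling convention can be sketched as follows. This is a minimal example, assuming Associations.jl v4 is installed; the synthetic data, the seed, the value of k and the number of shuffles are illustrative choices, not recommendations.

```julia
using Associations
using Random

rng = MersenneTwister(1234)

# Two coupled synthetic timeseries (illustrative data only).
x = randn(rng, 500)
y = x .+ 0.5 .* randn(rng, 500)

# v4 syntax: the estimator wraps the definition it estimates.
est = KSG1(MIShannon(); k = 5)
mi = association(est, x, y)

# The renamed surrogate test takes the same estimator and is
# evaluated with `independence`.
test = SurrogateAssociationTest(est; nshuffles = 50, rng = rng)
result = independence(test, x, y)
```

Because the definition lives inside the estimator, the same est object can be passed both to association and to an independence test without repeating the measure.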

Getting started

The quickest way to get going with the package is to check out the examples in the left-hand menu.

Info

To make it easier to navigate the extensive documentation, all documentation strings are collapsed by default. Click the arrow icon in the top toolbar to expand/collapse the docstrings in a page.

Documentation content

Association measures lists all implemented association measures and their estimators.
Independence testing lists all implemented ways of determining if an association between datasets is "significant".
Causal inference lists all methods of inferring association networks (also called "network graphs" and "causal graphs") between multiple variables.
Numerous examples for association measure estimation, independence testing, and network inference.

Input data for Associations.jl

Input data for Associations.jl are given as:

Univariate timeseries, which are given as standard Julia Vectors.
Multivariate timeseries, also called state space sets, which are given as StateSpaceSets. Many methods convert timeseries inputs to StateSpaceSet for faster internal computations.
Categorical data, which can be used with JointProbabilities to compute various information theoretic measures. Categorical data are represented by any iterable whose elements can be of any arbitrarily complex data type, as long as it is hashable, for example Vector{String}, Vector{Int}, or Vector{Tuple{Int, String}}.

StateSpaceSets.StateSpaceSet — Type

StateSpaceSet{D, T, V} <: AbstractVector{V}

A dedicated interface for sets in a state space. It is an ordered container of equally-sized points of length D, with element type T, represented by a vector of type V. Typically V is SVector{D,T} or Vector{T} and the data are always stored internally as Vector{V}. SSSet is an alias for StateSpaceSet.

The underlying Vector{V} can be obtained by vec(ssset), although this is almost never necessary because StateSpaceSet subtypes AbstractVector and extends its interface. StateSpaceSet also supports almost all sensible vector operations like append!, push!, hcat, eachrow, among others. When iterated over, it iterates over its contained points.

Construction

Constructing a StateSpaceSet is done in three ways:

By passing the individual columns of the state space set, each as a Vector{<:Real}: StateSpaceSet(x, y, z, ...).
By passing a matrix whose rows are the state space points: StateSpaceSet(m).
By passing a vector of vectors (state space points) directly: StateSpaceSet(v_of_v).

All constructors accept the keyword container, which sets the type of V (the type of the inner vectors). Currently the only options are SVector, MVector, or Vector; SVector is the default.
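
The three construction routes can be sketched as below. This is a minimal example assuming StateSpaceSets.jl is installed in a version that supports the container keyword described above; the variable names and values are illustrative.

```julia
using StateSpaceSets

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]

# 1. From individual columns (one vector per variable).
A = StateSpaceSet(x, y)

# 2. From a matrix whose rows are the state space points.
m = [1.0 4.0; 2.0 5.0; 3.0 6.0]
B = StateSpaceSet(m)

# 3. From a vector of vectors (one inner vector per point).
v_of_v = [[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]]
C = StateSpaceSet(v_of_v)

# The `container` keyword selects the inner vector type
# (here plain `Vector` instead of the default `SVector`).
D = StateSpaceSet(x, y; container = Vector)
```

All three routes hold the same three two-dimensional points; only D differs, in the type used to store each point.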

Description of indexing

When indexed with one index, StateSpaceSet behaves exactly like its encapsulated vector, i.e., a vector of vectors (state space points). When indexed with two indices, it behaves like a matrix where each row is a point.

In the following let i, j be integers, typeof(X) <: AbstractStateSpaceSet and v1, v2 be <: AbstractVector{Int} (v1, v2 could also be ranges, and for performance benefits make v2 an SVector{Int}).

X[i] == X[i, :] gives the ith point (returns an SVector)
X[v1] == X[v1, :], returns a StateSpaceSet with the points in those indices.
X[:, j] gives the jth variable timeseries (or collection), as Vector
X[v1, v2], X[:, v2] returns a StateSpaceSet with the appropriate entries (first indices being "time"/point index, while second being variables)
X[i, j] value of the jth variable, at the ith timepoint
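
A few of these indexing rules are illustrated in the sketch below, assuming StateSpaceSets.jl is installed and the default SVector container is used; the data are made up for illustration.

```julia
using StateSpaceSets

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
X = StateSpaceSet(x, y)

p = X[2]                 # the 2nd point (both variables at time index 2)
head = X[1:2]            # a StateSpaceSet holding the first two points
col = X[:, 1]            # the 1st variable as a plain Vector
sub = X[[1, 3], [2]]     # points 1 and 3, restricted to variable 2
val = X[3, 2]            # value of variable 2 at time index 3
```

The first index always selects points ("time") and the second selects variables, so X[3, 2] reads like a matrix lookup even though the data are stored as a vector of points.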

Use Matrix(ssset) or StateSpaceSet(matrix) to convert. It is assumed that each column of the matrix is one variable. If you have various timeseries vectors x, y, z, ... pass them like StateSpaceSet(x, y, z, ...). You can use columns(dataset) to obtain the reverse, i.e. all columns of the dataset in a tuple.
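
The conversions mentioned above can be sketched as a round trip. This assumes StateSpaceSets.jl is installed; the data are illustrative.

```julia
using StateSpaceSets

x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
X = StateSpaceSet(x, y)

# To a matrix: one row per point, one column per variable.
M = Matrix(X)

# Back to a StateSpaceSet from the matrix.
X2 = StateSpaceSet(M)

# Recover the individual variables as a tuple of vectors.
cx, cy = columns(X)
```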


Maintainers and contributors

The Associations.jl software is maintained by Kristian Agasøster Haaga, who also curates and writes this documentation. Significant contributions to the API and documentation design have been made by George Datseris, who also co-authors ComplexityMeasures.jl, which we develop in tandem with this package.

A complete list of contributors to this repo is available on the main GitHub page. Some important contributions are:

Norbert Genera contributed bug reports and investigations that led to subsequent improvements for the pairwise asymmetric inference algorithm and an improved cross mapping API.
David Diego's contributions were invaluable in the initial stages of development. His MATLAB code provided the basis for several transfer entropy methods and binning-related code.
George Datseris also ported KSG1 and KSG2 mutual information estimators to Neighborhood.jl.
Bjarte Hannisdal provided tutorials for mutual information.
Tor Einar Møller contributed to cross-mapping methods in initial stages of development.

Many individuals have contributed code to other packages in the JuliaDynamics ecosystem which we use here. Contributors are listed in the respective GitHub repos and webpages.