Associations.jl
Associations (Module)
Associations.jl is a package for quantifying associations, independence testing and causal inference.
All further information is provided in the documentation, which you can either find online or build locally by running the docs/make.jl file.
Key features
- Association API: includes measures and their estimators for pairwise, conditional and other forms of association from conventional statistics, from dynamical systems theory, and from information theory: partial correlation, distance correlation, (conditional) mutual information, transfer entropy, convergent cross mapping and a lot more!
- Independence testing API, which is automatically compatible with every association measure estimator implemented in the package.
- Causal (network) inference API integrating the association measures and independence testing framework.
Additional features
Building on features from ComplexityMeasures.jl, we also offer:
- Discretization API for multiple (multivariate) input datasets.
- Multivariate counting and probability estimation API.
- Multivariate information measure API.
Installation
To install the package, run import Pkg; Pkg.add("Associations").
Previously, this package was called CausalityTools.jl.
Latest news
The package has been renamed from CausalityTools.jl to Associations.jl.
Associations.jl has been updated to v4!
This update includes a number of breaking changes. They were made to ensure compatibility with ComplexityMeasures.jl v3, which provides the discretization functionality that we use here.
Important changes are:
- Convenience methods have been removed completely. Use association instead.
- Example systems have been removed.
- The syntax for computing an association has changed. Estimators now always contain the definition they estimate. For example, association(MIShannon(), KSG1(), x, y) is now association(KSG1(MIShannon()), x, y).
- SurrogateTest has been renamed to SurrogateAssociationTest.
- See the CHANGELOG.md for a complete list of changes.
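As a minimal sketch of the new syntax, the snippet below estimates Shannon mutual information with the KSG1 estimator and then passes the same estimator to an independence test. The synthetic data, the seed, and the variable names are illustrative assumptions, and SurrogateAssociationTest is used with its default settings, which are assumed to be adequate here.

```julia
using Associations
using Random

rng = MersenneTwister(1234)          # illustrative seed
x = rand(rng, 500)
y = x .+ 0.1 .* randn(rng, 500)      # y depends strongly on x (synthetic data)

# v4 syntax: the estimator wraps the definition it estimates.
est = KSG1(MIShannon())
mi = association(est, x, y)

# The same estimator can be handed to an independence test.
test = SurrogateAssociationTest(est)
result = independence(test, x, y)
```

Here mi is a single number (the estimated mutual information) and result summarizes the surrogate-based test; consult the package documentation for the keyword arguments each constructor accepts.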
Getting started
The quickest way to get going with the package is to check out the examples in the left-hand menu.
To make it easier to navigate the extensive documentation, all documentation strings are collapsed by default. Click the arrow icon in the top toolbar to expand/collapse the docstrings in a page.
Documentation content
- Association measures lists all implemented association measures and their estimators.
- Independence testing lists all implemented ways of determining if an association between datasets is "significant".
- Causal inference lists all methods of inferring association networks (also called "network graphs" and "causal graphs") between multiple variables.
- Numerous examples for association measure estimation, independence testing, and network inference.
Input data for Associations.jl
Input data for Associations.jl are given as:
- Univariate timeseries, which are given as standard Julia Vectors.
- Multivariate timeseries (also called state space sets), which are given as StateSpaceSets. Many methods convert timeseries inputs to StateSpaceSet for faster internal computations.
- Categorical data, which can be used with JointProbabilities to compute various information theoretic measures. It is represented using any iterable whose elements can be any arbitrarily complex data type (as long as it's hashable), for example Vector{String}, Vector{Int}, or Vector{Tuple{Int, String}}.
StateSpaceSets.StateSpaceSet (Type)
StateSpaceSet{D, T, V} <: AbstractVector{V}
A dedicated interface for sets in a state space. It is an ordered container of equally-sized points of length D, with element type T, represented by a vector of type V. Typically V is SVector{D,T} or Vector{T} and the data are always stored internally as Vector{V}. SSSet is an alias for StateSpaceSet.
The underlying Vector{V} can be obtained by vec(ssset), although this is almost never necessary because StateSpaceSet subtypes AbstractVector and extends its interface. StateSpaceSet also supports almost all sensible vector operations like append!, push!, hcat, eachrow, among others. When iterated over, it iterates over its contained points.
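The vector-like behaviour described above can be sketched as follows. The data and variable names are made up for illustration, and the snippet assumes StateSpaceSets.jl is installed.

```julia
using StateSpaceSets

# Two variables, three points: (1, 4), (2, 5), (3, 6).
X = StateSpaceSet([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])

inner = vec(X)            # the underlying vector of points
n_before = length(X)      # number of points before pushing

push!(X, X[1])            # append a copy of the first point
n_after = length(X)

# Iterating over X visits one point at a time.
total = sum(sum(point) for point in X)
```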
Construction
Constructing a StateSpaceSet is done in three ways:
- By passing the individual columns of the state space set as Vector{<:Real}: StateSpaceSet(x, y, z, ...).
- By passing a matrix whose rows are the state space points: StateSpaceSet(m).
- By passing a vector of vectors (state space points) directly: StateSpaceSet(v_of_v).
All constructors allow for the keyword container, which sets the type of V (the type of inner vectors). At the moment, the only options are SVector, MVector, or Vector, and SVector is used by default.
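The three construction routes and the container keyword can be sketched like this. The random data and the names X1 to X4 are illustrative, and the snippet assumes a StateSpaceSets.jl version that matches the docstring above (one that accepts container).

```julia
using StateSpaceSets

x, y, z = rand(5), rand(5), rand(5)

# 1. From individual columns.
X1 = StateSpaceSet(x, y, z)

# 2. From a matrix whose rows are the state space points.
m = hcat(x, y, z)                              # 5×3 matrix
X2 = StateSpaceSet(m)

# 3. From a vector of vectors (one inner vector per point).
v_of_v = [[x[i], y[i], z[i]] for i in 1:5]
X3 = StateSpaceSet(v_of_v)

# Same data, but with plain Vectors as the inner point type.
X4 = StateSpaceSet(x, y, z; container = Vector)
```

All four sets hold the same five three-dimensional points; only X4 differs in how each point is stored.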
Description of indexing
When indexed with 1 index, StateSpaceSet behaves exactly like its encapsulated vector, i.e., a vector of vectors (state space points). When indexed with 2 indices, it behaves like a matrix where each row is a point.
In the following, let i, j be integers, typeof(X) <: AbstractStateSpaceSet, and v1, v2 be <: AbstractVector{Int} (v1, v2 could also be ranges, and for performance benefits make v2 an SVector{Int}).
- X[i] == X[i, :] gives the ith point (returns an SVector).
- X[v1] == X[v1, :] returns a StateSpaceSet with the points in those indices.
- X[:, j] gives the jth variable timeseries (or collection), as a Vector.
- X[v1, v2], X[:, v2] returns a StateSpaceSet with the appropriate entries (first indices being "time"/point index, while second being variables).
- X[i, j] gives the value of the jth variable at the ith timepoint.
Use Matrix(ssset) or StateSpaceSet(matrix) to convert. It is assumed that each column of the matrix is one variable. If you have various timeseries vectors x, y, z, ..., pass them like StateSpaceSet(x, y, z, ...). You can use columns(dataset) to obtain the reverse, i.e. all columns of the dataset in a tuple.
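The indexing rules and conversions above can be illustrated with a small, hand-made set. The values are chosen so that each result is easy to read off; the names are illustrative and the snippet assumes StateSpaceSets.jl is installed.

```julia
using StateSpaceSets

x = collect(1.0:5.0)       # variable 1
y = collect(11.0:15.0)     # variable 2
z = collect(21.0:25.0)     # variable 3
X = StateSpaceSet(x, y, z)

p    = X[2]                # 2nd point: (2.0, 12.0, 22.0)
sub  = X[1:3]              # StateSpaceSet with the first three points
col  = X[:, 2]             # 2nd variable as a Vector, equal to y
val  = X[4, 3]             # 3rd variable at the 4th point: 24.0
part = X[1:3, [1, 3]]      # first three points, variables 1 and 3 only

M = Matrix(X)              # 5×3 matrix, one column per variable
cols = columns(X)          # tuple of the three column vectors
```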
Maintainers and contributors
The Associations.jl software is maintained by Kristian Agasøster Haaga, who also curates and writes this documentation. Significant contributions to the API and documentation design have been made by George Datseris, who also co-authors ComplexityMeasures.jl, which we develop in tandem with this package.
A complete list of contributors to this repo is available on the main GitHub page. Some important contributions are:
- Norbert Genera contributed bug reports and investigations that led to subsequent improvements for the pairwise asymmetric inference algorithm and an improved cross mapping API.
- David Diego's contributions were invaluable in the initial stages of development. His MATLAB code provided the basis for several transfer entropy methods and binning-related code.
- George Datseris also ported KSG1 and KSG2 mutual information estimators to Neighborhood.jl.
- Bjarte Hannisdal provided tutorials for mutual information.
- Tor Einar Møller contributed to cross-mapping methods in initial stages of development.
Many individuals have contributed code to other packages in the JuliaDynamics ecosystem which we use here. Contributors are listed in the respective GitHub repos and webpages.