# CausalityTools.jl

`CausalityTools` is a Julia package that provides algorithms for *detecting dynamical influences* and *causal inference* based on time series data, and other commonly used measures of dependence and association.

You are reading the documentation for the development version of CausalityTools.jl, which will become version 2.0.

## Content

The goal of CausalityTools.jl is to provide an easily extendable library of univariate, bivariate and multivariate measures of complexity, association and (directional) dependence between data of various kinds. We currently offer:

- A suite of information measures, such as `mutualinfo`, `condmutualinfo` and `transferentropy`, along with a plethora of estimators for computation of discrete and continuous variants of these measures.
- A generic cross-map interface for causal inference methods based on state space prediction methods. This includes measures such as `ConvergentCrossMapping` and `PairwiseAsymmetricInference`.

Other measures are found in the menu.

## Input data

Input data for CausalityTools are given as:

- Univariate *timeseries*, which are given as standard Julia `Vector`s.
- Multivariate timeseries, *datasets*, or *state space sets*, which are given as `Dataset`s. Many methods convert *timeseries* inputs to `Dataset` for faster internal computations.
- Categorical data, which can be used with `ContingencyMatrix` to compute various information theoretic measures. Such data are represented by any iterable whose elements are of any hashable type, however complex, for example `Vector{String}`, `Vector{Vector{Int}}`, or `Vector{Tuple{Int, String}}`.
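The three input forms listed above can be sketched as follows. This is a minimal illustration, assuming a StateSpaceSets.jl version that still exports `Dataset` (later releases name the type `StateSpaceSet`); the variable names are purely illustrative.

```julia
using StateSpaceSets  # assumption: a version that exports `Dataset`

# Univariate timeseries: plain Julia vectors.
x = rand(100)
y = rand(100)

# Multivariate timeseries: a `Dataset` built from equally long vectors,
# one vector per variable.
data = Dataset(x, y)

# Categorical data: any iterable whose elements are hashable.
labels = ["a", "b", "a", "c"]             # Vector{String}
tagged = [(1, "a"), (2, "b"), (1, "a")]   # Vector{Tuple{Int, String}}
```

Here `data` holds 100 points with two variables each, while `labels` and `tagged` are ordinary vectors that need no special container.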

`StateSpaceSets.Dataset` (type)

`Dataset{D, T} <: AbstractDataset{D,T}`

A dedicated interface for datasets. It contains *equally-sized datapoints* of length `D`, represented by `SVector{D, T}`. These data are a standard Julia `Vector{SVector}`, and can be obtained with `vec(dataset)`.

When indexed with one index, a `dataset` behaves like a vector of datapoints. When indexed with two indices, it behaves like a matrix whose columns are the timeseries of the individual variables.

`Dataset` also supports most sensible operations like `append!, push!, hcat, eachrow`, among others, and when iterated over, it iterates over its contained points.

**Description of indexing**

In the following, let `i, j` be integers, `typeof(data) <: AbstractDataset`, and `v1, v2` be `<: AbstractVector{Int}` (`v1, v2` could also be ranges; for massive performance benefits, make `v2` an `SVector{X, Int}`).

- `data[i] == data[i, :]` gives the `i`th datapoint (returns an `SVector`).
- `data[v1] == data[v1, :]` returns a `Dataset` with the points at those indices.
- `data[:, j]` gives the `j`th variable's timeseries, as a `Vector`.
- `data[v1, v2]` and `data[:, v2]` return a `Dataset` with the appropriate entries (the first index is the "time"/point index, the second is the variable index).
- `data[i, j]` gives the value of the `j`th variable at the `i`th timepoint.
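The indexing rules above can be tried on a small dataset. This is a sketch only, assuming a StateSpaceSets.jl version that exports `Dataset`; the data values and variable names are made up for illustration.

```julia
using StateSpaceSets  # assumption: a version that exports `Dataset`

# Three variables, five timepoints each.
x = collect(1.0:5.0)
y = collect(11.0:15.0)
z = collect(21.0:25.0)
data = Dataset(x, y, z)

point  = data[2]              # second datapoint, an SVector of length 3
series = data[:, 3]           # timeseries of the third variable, a Vector
subset = data[1:3]            # the first three points
block  = data[2:4, [1, 3]]    # points 2 to 4, variables 1 and 3 only
value  = data[4, 2]           # second variable at the fourth timepoint
```

Single-integer indexing walks along points (rows), while the second index selects variables (columns), matching the matrix analogy described earlier.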

Use `Matrix(dataset)` or `Dataset(matrix)` to convert between the two. It is assumed that each *column* of the `matrix` is one variable. If you have several timeseries vectors `x, y, z, ...`, pass them as `Dataset(x, y, z, ...)`. You can use `columns(dataset)` to obtain the reverse, i.e. all columns of the dataset in a tuple.
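A round trip through these conversions might look like the following. This is a minimal sketch, assuming a StateSpaceSets.jl version that exports `Dataset` and `columns`; the matrix contents are arbitrary.

```julia
using StateSpaceSets  # assumption: a version that exports `Dataset` and `columns`

# A 3×2 matrix: three timepoints (rows), two variables (columns).
m = [1.0 10.0;
     2.0 20.0;
     3.0 30.0]

data = Dataset(m)      # each column of `m` becomes one variable
m2 = Matrix(data)      # back to a plain matrix

# Recover the individual variable timeseries.
a, b = columns(data)
```

Because columns are treated as variables, a matrix stored with variables along rows should be transposed before being passed to `Dataset`.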

This package has been and is under heavy development. Don't hesitate to submit an issue if you find something that doesn't work or doesn't make sense, or if there's some functionality that you're missing. Pull requests are also very welcome!