Inferring causal graphs

Directed causal graphical models, estimated on observed data, is an incredibly useful framework for causal inference. There exists a plethora of methods for estimating such models.

Useful reading:

Pearl, J. Glymour, M., & Jewell, N. P. (2016). Causal inference in statistics: A primer. John Wiley & Sons. An excellent introductory book, suitable for anyone interested, from a beginners to experts.
Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in genetics, 10, 524. The authoritative overview of causal discovery from graphical models. Many more methods have also emerged since this paper.

Causal graph API

The API for inferring causal graphs is defined by:

infer_graph
GraphAlgorithm, and its subtypes

Associations.infer_graph — Function

infer_graph(algorithm::GraphAlgorithm, x) → g

Infer graph from input data x using the given algorithm.

Returns g, whose type depends on algorithm.

source

Associations.GraphAlgorithm — Type

GraphAlgorithm

The supertype of all causal graph inference algorithms.

Concrete implementations

OCE. The optimal causation entropy algorithm for time series graphs.
PC.

source

Optimal causation entropy

Associations.OCE — Type

OCE <: GraphAlgorithm
OCE(; utest::IndependenceTest = SurrogateAssociationTest(MIShannon(), KSG2(k = 3, w = 3)),
      ctest::C = LocalPermutationTest(CMIShannon(), MesnerShalizi(k = 3, w = 3)),
      τmax::T = 1, α = 0.05)

The optimal causation entropy (OCE) algorithm for causal discovery Sun et al. (2015).

Description

The OCE algorithm has three steps to determine the parents of a variable xᵢ.

Perform pairwise independence tests using utest and select the variable xⱼ(-τ) that has the highest significant (i.e. with associated p-value below α) association with xᵢ(0). Assign it to the set of selected parents P.
Perform conditional independence tests using ctest, finding the parent Pₖ that has the highest association with xᵢ given the already selected parents, and add it to P. Repeat until no more variables with significant association are found.
Backwards elimination of parents Pₖ of xᵢ(0) for which xᵢ(0) ⫫ Pₖ | P - {Pₖ}, where P is the set of parent nodes found in the previous steps.

τmax indicates the maximum lag τ between the target variable xᵢ(0) and its potential parents xⱼ(-τ). Sun et al. 2015's method is based on τmax = 1.

Returns

When used with infer_graph, it returns a vector p, where p[i] are the parents for each input variable. This result can be converted to a SimpleDiGraph from Graphs.jl (see example).

Usage

OCE is used with infer_graph to infer the parents of the input data. Input data must either be a Vector{Vector{<:Real}}, or a StateSpaceSet.

Examples

Inferring time series graph from a chain of logistic maps

source

Associations.OCESelectedParents — Type

OCESelectedParents

A simple struct for storing the parents of a single variable xᵢ inferred by the OCE algorithm. When using OCE with infer_graph, a Vector{OCESelectedParents} is returned - one per variable in the input data.

Assumptions and notation

Assumes the input x is a Vector{Vector{<:Real}} or a StateSpaceSet (for which each column is treated as a variable). It contains the following fields, where we use the notation xₖ(τ) to indicate the k-th variable lagged by time-lag τ. For example, x₂(-3) is the variable x[2] lagged by 3 time steps.

Fields

i: The index of the target variable (i.e. xᵢ(0) is the target).
all_idxs: The possible variable indices of parent variables (i.e. 1:M, where M is the number of input variables).
parents_js: The variable indices of the selected parent variables –- one per selected parent.
parents_τs: The lags for the selected parent variables –- one per selected parent.
parents: A vector containing the raw, time-lagged data for each selected parent variables. Let τ = parents_τs[k] and j = parents_js[k]. Then parents[k] is the raw data for the variable xⱼ(-τ).

source

PC

Associations.PC — Type

PC <: GraphAlgorithm
PC(pairwise_test, conditional_test;
    α = 0.05, max_depth = Inf, maxiters_orient = Inf)

The PC algorithm (Spirtes et al., 2000), which is named named after the first names of the authors, Peter Spirtes and Clark Glymour, which is implemented as described in Kalisch and Bühlmann (2008).

Arguments

pairwise_test: An IndependenceTest that uses a pairwise, nondirectional AssociationMeasure measure (e.g. a parametric CorrTest, or SurrogateAssociationTest with the MIShannon measure).
conditional_test: An IndependenceTest that uses a conditional, nondirectional AssociationMeasure (e.g. CorrTest, or SurrogateAssociationTest with the CMIShannon measure).

Keyword arguments

α::Real. The significance level of the test.
max_depth. The maximum level of conditional indendence tests to be performed. By default, there is no limit (i.e. max_depth = Inf), meaning that maximum depth is N - 2, where N is the number of input variables.
maxiters_orient::Real. The maximum number of times to apply the orientation rules. By default, there is not limit (i.e. maxiters_orient = Inf).

Directional measures will not give meaningful answers

During the skeleton search phase, if a significance association between two nodes are is found, then a bidirectional edge is drawn between them. The generic implementation of PC therefore doesn't currently handle directional measures such as TEShannon. The reason is that if a directional relationship X → Y exists between two nodes X and Y, then the algorithm would first draw a bidirectional arrow between X and Y when analysing the direction X → Y, and then removing it again when analysing in the direction Y → X (a similar situation would also occur for the conditional stage). This will be fixed in a future release. For now, use nondirectional measures, e.g. MIShannon and CMIShannon!

Description

When used with infer_graph on some input data x, the PC algorithm performs the following steps:

Initialize an empty fully connected graph g with N nodes, where N is the number of variables and x[i] is the data for the i-th node.
Reduce the fully connected g to a skeleton graph by performing pairwise independence tests between all vertices using pairwise_test. Remove any edges where adjacent vertices are found to be independent according to the test (i.e. the null hypothesis of independence cannot be rejected at significance level 1 - α).
Thin the skeleton g by conditional independence testing. If x[i] ⫫ x[j] | x[Z] for some set of variables Z (not including i and j) according to conditional_test (i.e. the null hypothesis of conditional independence cannot be rejected at significance level 1 - α), then the edge between i and j is removed, and we record the separating set S(i, j) = Z. Independence tests are first performed for conditioning sets of size 1, and repeated for conditioning sets of increasing size, which in most cases limits the number of tests needed. The separating sets S(i, j), which records which variables were in the conditioning set that rendered variables i and j independent, are recorded. If max_depth is an integer, then this procedure is performed on conditioning sets of sizes 1:max_depth, and if max_depth == nothing, then all possible conditioning set sizes are potentially used.
Create a directed graph dg from g by replacing every undirected edge X - Y in g by the bidirectional edge X ↔ Y (i.e. construct two directional edges X → Y and Y → X). Orientiation rules 0-3 are then repeatedly applied to dg until no more edges can be oriented:
- Rule 0 (orients v-structures): X ↔ Y ↔ Z becomes X → Y ← Z if Y is not in the separating set S(X, Z).
- Rule 1 (prevents new v-structures): X → Y ↔ Z becomes X → Y → Z if X and Z are not adjacent.
- Rule 2 (avoids cycles): X → Y → Z ↔ X becomes X → Y → Z ← X.
- Rule 3: To avoid creating cycles or new v-structures, whenever X - Y → Z, X - W → Z, and X - Z but there is no edge between Y and W, turn the undirected X - Z edge into the directed edge X → Z.

The resulting directed graph (a SimpleDiGraph from Graphs.jl) is then returned.

Examples

source