Inferring causal graphs
Directed causal graphical models, estimated on observed data, is an incredibly useful framework for causal inference. There exists a plethora of methods for estimating such models.
Useful reading:
- Pearl, J. Glymour, M., & Jewell, N. P. (2016). Causal inference in statistics: A primer. John Wiley & Sons. An excellent introductory book, suitable for anyone interested, from a beginners to experts.
- Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in genetics, 10, 524. The authoritative overview of causal discovery from graphical models. Many more methods have also emerged since this paper.
Causal graph API
The API for inferring causal graphs is defined by:
infer_graphGraphAlgorithm, and its subtypes
Associations.infer_graph — Functioninfer_graph(algorithm::GraphAlgorithm, x) → gInfer graph from input data x using the given algorithm.
Returns g, whose type depends on algorithm.
Associations.GraphAlgorithm — TypeGraphAlgorithmThe supertype of all causal graph inference algorithms.
Concrete implementations
Optimal causation entropy
Associations.OCE — TypeOCE <: GraphAlgorithm
OCE(; utest::IndependenceTest = SurrogateAssociationTest(MIShannon(), KSG2(k = 3, w = 3)),
ctest::C = LocalPermutationTest(CMIShannon(), MesnerShalizi(k = 3, w = 3)),
τmax::T = 1, α = 0.05)The optimal causation entropy (OCE) algorithm for causal discovery Sun et al. (2015).
Description
The OCE algorithm has three steps to determine the parents of a variable xᵢ.
- Perform pairwise independence tests using
utestand select the variablexⱼ(-τ)that has the highest significant (i.e. with associated p-value belowα) association withxᵢ(0). Assign it to the set of selected parentsP. - Perform conditional independence tests using
ctest, finding the parentPₖthat has the highest association withxᵢgiven the already selected parents, and add it toP. Repeat until no more variables with significant association are found. - Backwards elimination of parents
Pₖofxᵢ(0)for whichxᵢ(0) ⫫ Pₖ | P - {Pₖ}, wherePis the set of parent nodes found in the previous steps.
τmax indicates the maximum lag τ between the target variable xᵢ(0) and its potential parents xⱼ(-τ). Sun et al. 2015's method is based on τmax = 1.
Returns
When used with infer_graph, it returns a vector p, where p[i] are the parents for each input variable. This result can be converted to a SimpleDiGraph from Graphs.jl (see example).
Usage
OCE is used with infer_graph to infer the parents of the input data. Input data must either be a Vector{Vector{<:Real}}, or a StateSpaceSet.
Examples
Associations.OCESelectedParents — TypeOCESelectedParentsA simple struct for storing the parents of a single variable xᵢ inferred by the OCE algorithm. When using OCE with infer_graph, a Vector{OCESelectedParents} is returned - one per variable in the input data.
Assumptions and notation
Assumes the input x is a Vector{Vector{<:Real}} or a StateSpaceSet (for which each column is treated as a variable). It contains the following fields, where we use the notation xₖ(τ) to indicate the k-th variable lagged by time-lag τ. For example, x₂(-3) is the variable x[2] lagged by 3 time steps.
Fields
i: The index of the target variable (i.e.xᵢ(0)is the target).all_idxs: The possible variable indices of parent variables (i.e.1:M, whereMis the number of input variables).parents_js: The variable indices of the selected parent variables –- one per selected parent.parents_τs: The lags for the selected parent variables –- one per selected parent.parents: A vector containing the raw, time-lagged data for each selected parent variables. Letτ = parents_τs[k]andj = parents_js[k]. Thenparents[k]is the raw data for the variablexⱼ(-τ).
PC
Associations.PC — TypePC <: GraphAlgorithm
PC(pairwise_test, conditional_test;
α = 0.05, max_depth = Inf, maxiters_orient = Inf)The PC algorithm (Spirtes et al., 2000), which is named named after the first names of the authors, Peter Spirtes and Clark Glymour, which is implemented as described in Kalisch and Bühlmann (2008).
Arguments
pairwise_test: AnIndependenceTestthat uses a pairwise, nondirectionalAssociationMeasuremeasure (e.g. a parametricCorrTest, orSurrogateAssociationTestwith theMIShannonmeasure).conditional_test: AnIndependenceTestthat uses a conditional, nondirectionalAssociationMeasure(e.g.CorrTest, orSurrogateAssociationTestwith theCMIShannonmeasure).
Keyword arguments
α::Real. The significance level of the test.max_depth. The maximum level of conditional indendence tests to be performed. By default, there is no limit (i.e.max_depth = Inf), meaning that maximum depth isN - 2, whereNis the number of input variables.maxiters_orient::Real. The maximum number of times to apply the orientation rules. By default, there is not limit (i.e.maxiters_orient = Inf).
During the skeleton search phase, if a significance association between two nodes are is found, then a bidirectional edge is drawn between them. The generic implementation of PC therefore doesn't currently handle directional measures such as TEShannon. The reason is that if a directional relationship X → Y exists between two nodes X and Y, then the algorithm would first draw a bidirectional arrow between X and Y when analysing the direction X → Y, and then removing it again when analysing in the direction Y → X (a similar situation would also occur for the conditional stage). This will be fixed in a future release. For now, use nondirectional measures, e.g. MIShannon and CMIShannon!
Description
When used with infer_graph on some input data x, the PC algorithm performs the following steps:
- Initialize an empty fully connected graph
gwithNnodes, whereNis the number of variables andx[i]is the data for thei-th node. - Reduce the fully connected
gto a skeleton graph by performing pairwiseindependencetests between all vertices usingpairwise_test. Remove any edges where adjacent vertices are found to be independent according to the test (i.e. the null hypothesis of independence cannot be rejected at significance level1 - α). - Thin the skeleton
gby conditionalindependencetesting. Ifx[i] ⫫ x[j] | x[Z]for some set of variablesZ(not includingiandj) according toconditional_test(i.e. the null hypothesis of conditional independence cannot be rejected at significance level1 - α), then the edge betweeniandjis removed, and we record the separating set S(i, j) = Z. Independence tests are first performed for conditioning sets of size 1, and repeated for conditioning sets of increasing size, which in most cases limits the number of tests needed. The separating setsS(i, j), which records which variables were in the conditioning set that rendered variablesiandjindependent, are recorded. Ifmax_depthis an integer, then this procedure is performed on conditioning sets of sizes1:max_depth, and ifmax_depth == nothing, then all possible conditioning set sizes are potentially used. - Create a directed graph
dgfromgby replacing every undirected edgeX - Yingby the bidirectional edgeX ↔ Y(i.e. construct two directional edgesX → YandY → X). Orientiation rules 0-3 are then repeatedly applied todguntil no more edges can be oriented:- Rule 0 (orients v-structures):
X ↔ Y ↔ ZbecomesX → Y ← ZifYis not in the separating setS(X, Z). - Rule 1 (prevents new v-structures):
X → Y ↔ ZbecomesX → Y → ZifXandZare not adjacent. - Rule 2 (avoids cycles):
X → Y → Z ↔ XbecomesX → Y → Z ← X. - Rule 3: To avoid creating cycles or new v-structures, whenever
X - Y → Z,X - W → Z, andX - Zbut there is no edge betweenYandW, turn the undirectedX - Zedge into the directed edgeX → Z.
- Rule 0 (orients v-structures):
The resulting directed graph (a SimpleDiGraph from Graphs.jl) is then returned.
Examples