Inferring causal graphs
Directed causal graphical models, estimated on observed data, is an incredibly useful framework for causal inference. There exists a plethora of methods for estimating such models.
Useful reading:
- Pearl, J. Glymour, M., & Jewell, N. P. (2016). Causal inference in statistics: A primer. John Wiley & Sons. An excellent introductory book, suitable for anyone interested, from a beginners to experts.
- Glymour, C., Zhang, K., & Spirtes, P. (2019). Review of causal discovery methods based on graphical models. Frontiers in genetics, 10, 524. The authoritative overview of causal discovery from graphical models. Many more methods have also emerged since this paper.
Causal graph API
The API for inferring causal graphs is defined by:
infer_graph
GraphAlgorithm
, and its subtypes
Associations.infer_graph
— Functioninfer_graph(algorithm::GraphAlgorithm, x) → g
Infer graph from input data x
using the given algorithm
.
Returns g
, whose type depends on algorithm
.
Associations.GraphAlgorithm
— TypeGraphAlgorithm
The supertype of all causal graph inference algorithms.
Concrete implementations
Optimal causation entropy
Associations.OCE
— TypeOCE <: GraphAlgorithm
OCE(; utest::IndependenceTest = SurrogateAssociationTest(MIShannon(), KSG2(k = 3, w = 3)),
ctest::C = LocalPermutationTest(CMIShannon(), MesnerShalizi(k = 3, w = 3)),
τmax::T = 1, α = 0.05)
The optimal causation entropy (OCE) algorithm for causal discovery Sun et al. (2015).
Description
The OCE algorithm has three steps to determine the parents of a variable xᵢ
.
- Perform pairwise independence tests using
utest
and select the variablexⱼ(-τ)
that has the highest significant (i.e. with associated p-value belowα
) association withxᵢ(0)
. Assign it to the set of selected parentsP
. - Perform conditional independence tests using
ctest
, finding the parentPₖ
that has the highest association withxᵢ
given the already selected parents, and add it toP
. Repeat until no more variables with significant association are found. - Backwards elimination of parents
Pₖ
ofxᵢ(0)
for whichxᵢ(0) ⫫ Pₖ | P - {Pₖ}
, whereP
is the set of parent nodes found in the previous steps.
τmax
indicates the maximum lag τ
between the target variable xᵢ(0)
and its potential parents xⱼ(-τ)
. Sun et al. 2015's method is based on τmax = 1
.
Returns
When used with infer_graph
, it returns a vector p
, where p[i]
are the parents for each input variable. This result can be converted to a SimpleDiGraph
from Graphs.jl (see example).
Usage
OCE
is used with infer_graph
to infer the parents of the input data. Input data must either be a Vector{Vector{<:Real}}
, or a StateSpaceSet
.
Examples
Associations.OCESelectedParents
— TypeOCESelectedParents
A simple struct for storing the parents of a single variable xᵢ
inferred by the OCE
algorithm. When using OCE
with infer_graph
, a Vector{OCESelectedParents}
is returned - one per variable in the input data.
Assumptions and notation
Assumes the input x
is a Vector{Vector{<:Real}}
or a StateSpaceSet
(for which each column is treated as a variable). It contains the following fields, where we use the notation xₖ(τ)
to indicate the k
-th variable lagged by time-lag τ
. For example, x₂(-3)
is the variable x[2]
lagged by 3 time steps.
Fields
i
: The index of the target variable (i.e.xᵢ(0)
is the target).all_idxs
: The possible variable indices of parent variables (i.e.1:M
, whereM
is the number of input variables).parents_js
: The variable indices of the selected parent variables –- one per selected parent.parents_τs
: The lags for the selected parent variables –- one per selected parent.parents
: A vector containing the raw, time-lagged data for each selected parent variables. Letτ = parents_τs[k]
andj = parents_js[k]
. Thenparents[k]
is the raw data for the variablexⱼ(-τ)
.
PC
Associations.PC
— TypePC <: GraphAlgorithm
PC(pairwise_test, conditional_test;
α = 0.05, max_depth = Inf, maxiters_orient = Inf)
The PC algorithm (Spirtes et al., 2000), which is named named after the first names of the authors, Peter Spirtes and Clark Glymour, which is implemented as described in Kalisch and Bühlmann (2008).
Arguments
pairwise_test
: AnIndependenceTest
that uses a pairwise, nondirectionalAssociationMeasure
measure (e.g. a parametricCorrTest
, orSurrogateAssociationTest
with theMIShannon
measure).conditional_test
: AnIndependenceTest
that uses a conditional, nondirectionalAssociationMeasure
(e.g.CorrTest
, orSurrogateAssociationTest
with theCMIShannon
measure).
Keyword arguments
α::Real
. The significance level of the test.max_depth
. The maximum level of conditional indendence tests to be performed. By default, there is no limit (i.e.max_depth = Inf
), meaning that maximum depth isN - 2
, whereN
is the number of input variables.maxiters_orient::Real
. The maximum number of times to apply the orientation rules. By default, there is not limit (i.e.maxiters_orient = Inf
).
During the skeleton search phase, if a significance association between two nodes are is found, then a bidirectional edge is drawn between them. The generic implementation of PC
therefore doesn't currently handle directional measures such as TEShannon
. The reason is that if a directional relationship X → Y
exists between two nodes X
and Y
, then the algorithm would first draw a bidirectional arrow between X
and Y
when analysing the direction X → Y
, and then removing it again when analysing in the direction Y → X
(a similar situation would also occur for the conditional stage). This will be fixed in a future release. For now, use nondirectional measures, e.g. MIShannon
and CMIShannon
!
Description
When used with infer_graph
on some input data x
, the PC
algorithm performs the following steps:
- Initialize an empty fully connected graph
g
withN
nodes, whereN
is the number of variables andx[i]
is the data for thei
-th node. - Reduce the fully connected
g
to a skeleton graph by performing pairwiseindependence
tests between all vertices usingpairwise_test
. Remove any edges where adjacent vertices are found to be independent according to the test (i.e. the null hypothesis of independence cannot be rejected at significance level1 - α
). - Thin the skeleton
g
by conditionalindependence
testing. Ifx[i] ⫫ x[j] | x[Z]
for some set of variablesZ
(not includingi
andj
) according toconditional_test
(i.e. the null hypothesis of conditional independence cannot be rejected at significance level1 - α
), then the edge betweeni
andj
is removed, and we record the separating set S(i, j) = Z. Independence tests are first performed for conditioning sets of size 1, and repeated for conditioning sets of increasing size, which in most cases limits the number of tests needed. The separating setsS(i, j)
, which records which variables were in the conditioning set that rendered variablesi
andj
independent, are recorded. Ifmax_depth
is an integer, then this procedure is performed on conditioning sets of sizes1:max_depth
, and ifmax_depth == nothing
, then all possible conditioning set sizes are potentially used. - Create a directed graph
dg
fromg
by replacing every undirected edgeX - Y
ing
by the bidirectional edgeX ↔ Y
(i.e. construct two directional edgesX → Y
andY → X
). Orientiation rules 0-3 are then repeatedly applied todg
until no more edges can be oriented:- Rule 0 (orients v-structures):
X ↔ Y ↔ Z
becomesX → Y ← Z
ifY
is not in the separating setS(X, Z)
. - Rule 1 (prevents new v-structures):
X → Y ↔ Z
becomesX → Y → Z
ifX
andZ
are not adjacent. - Rule 2 (avoids cycles):
X → Y → Z ↔ X
becomesX → Y → Z ← X
. - Rule 3: To avoid creating cycles or new v-structures, whenever
X - Y → Z
,X - W → Z
, andX - Z
but there is no edge betweenY
andW
, turn the undirectedX - Z
edge into the directed edgeX → Z
.
- Rule 0 (orients v-structures):
The resulting directed graph (a SimpleDiGraph
from Graphs.jl) is then returned.
Examples