Conditional mutual information

CMIShannon

Estimation using ConditionalMutualInformationEstimators

When conditional mutual information is estimated using a dedicated ConditionalMutualInformationEstimator, some form of bias correction is usually applied. The FPVP estimator is a popular choice.

CMIShannon with GaussianCMI

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = randn(n)
y = randn(n) .+ x
z = randn(n) .+ y
condmutualinfo(GaussianCMI(), x, z, y) # defaults to `CMIShannon()`
0.0017375851072253257

CMIShannon with FPVP

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(FPVP(k = 5), x, z, y) # defaults to `CMIShannon()`
-0.14002755285440707

CMIShannon with MesnerShalizi

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(MesnerShalizi(k = 10), x, z, y) # defaults to `CMIShannon()`
-0.12043962640339888

CMIShannon with Rahimzamani

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(CMIShannon(base = 10), Rahimzamani(k = 10), x, z, y)
-0.03165026116054392

CMIRenyiPoczos with PoczosSchneiderCMI

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(CMIRenyiPoczos(base = 2, q = 1.2), PoczosSchneiderCMI(k = 5), x, z, y)
-0.3927828474337882

Estimation using MutualInformationEstimators

Any MutualInformationEstimator can also be used to compute conditional mutual information via the chain rule of mutual information. However, naively applying these estimators in this way performs no bias correction when taking the difference of mutual information terms.
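
Concretely, the chain rule gives I(X; Z | Y) = I(X; Y, Z) - I(X; Y), so the mutual information estimator is applied twice and the results subtracted. The sketch below computes this difference by hand with KSG1; it assumes the joint variable (Y, Z) can be formed with the StateSpaceSet constructor and passed directly to mutualinfo, so treat it as an illustration rather than canonical usage.

using CausalityTools

n = 1000
# A chain X → Y → Z, as in the examples above
x = randn(n)
y = randn(n) .+ x
z = randn(n) .+ y

est = KSG1(k = 5)
yz = StateSpaceSet(y, z)                      # joint variable (Y, Z); assumed constructor
mi_xyz = mutualinfo(MIShannon(), est, x, yz)  # I(X; Y, Z)
mi_xy = mutualinfo(MIShannon(), est, x, y)    # I(X; Y)
mi_xyz - mi_xy                                # ≈ I(X; Z | Y); no bias correction applied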

CMIShannon with KSG1

using CausalityTools
using Distributions
using Statistics

n = 1000
# A chain X → Y → Z
x = rand(Normal(-1, 0.5), n)
y = rand(BetaPrime(0.5, 1.5), n) .+ x
z = rand(Chisq(100), n)
z = (z ./ std(z)) .+ y

# We expect zero (in practice: very low) CMI when computing I(X; Z | Y), because
# the link between X and Z is exclusively through Y, so when observing Y,
# X and Z should appear independent.
condmutualinfo(CMIShannon(base = 2), KSG1(k = 5), x, z, y)
-0.38323689721061016

Estimation using DifferentialEntropyEstimators

Any DifferentialEntropyEstimator can also be used to compute conditional mutual information via a sum of entropies. However, naively applying these estimators in this way performs no bias correction when taking the sum of entropy terms.
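
Here the relevant decomposition is I(X; Z | Y) = h(X, Y) + h(Y, Z) - h(X, Y, Z) - h(Y), where h denotes differential entropy. The sketch below evaluates the four terms directly with Kraskov; it assumes joint variables can be formed with the StateSpaceSet constructor and that entropy(Shannon(), est, x) is the available call signature, so treat it as an illustration rather than canonical usage.

using CausalityTools

n = 1000
# A chain X → Y → Z, as in the examples above
x = randn(n)
y = randn(n) .+ x
z = randn(n) .+ y

est = Kraskov(k = 5)
h_xy = entropy(Shannon(), est, StateSpaceSet(x, y))      # h(X, Y)
h_yz = entropy(Shannon(), est, StateSpaceSet(y, z))      # h(Y, Z)
h_xyz = entropy(Shannon(), est, StateSpaceSet(x, y, z))  # h(X, Y, Z)
h_y = entropy(Shannon(), est, y)                         # h(Y)
h_xy + h_yz - h_xyz - h_y                                # ≈ I(X; Z | Y); no bias correction applied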

CMIShannon with Kraskov

using CausalityTools
using Distributions
n = 1000
# X drives Y; Z is independent of both, so I(X; Z | Y) should be near zero
x = rand(Epanechnikov(0.5, 1.0), n)
y = rand(Erlang(1), n) .+ x
z = rand(FDist(5, 2), n)
condmutualinfo(CMIShannon(), Kraskov(k = 5), x, z, y)
-0.2831772952525391

Estimation using ProbabilitiesEstimators

Any ProbabilitiesEstimator can also be used to compute conditional mutual information via a sum of (discrete) entropies. However, naively applying these estimators in this way performs no bias correction when taking the sum of entropy terms.
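
The decomposition is the same as above, I(X; Z | Y) = H(X, Y) + H(Y, Z) - H(X, Y, Z) - H(Y), but with each (discrete) entropy computed from probabilities estimated over the same partition. A sketch under the same API assumptions as the previous one (joint variables via StateSpaceSet, entropies via entropy(Shannon(), est, x)):

using CausalityTools

n = 1000
# A chain X → Y → Z, as in the examples above
x = randn(n)
y = randn(n) .+ x
z = randn(n) .+ y

est = ValueHistogram(RectangularBinning(5))
H_xy = entropy(Shannon(), est, StateSpaceSet(x, y))      # H(X, Y)
H_yz = entropy(Shannon(), est, StateSpaceSet(y, z))      # H(Y, Z)
H_xyz = entropy(Shannon(), est, StateSpaceSet(x, y, z))  # H(X, Y, Z)
H_y = entropy(Shannon(), est, y)                         # H(Y)
H_xy + H_yz - H_xyz - H_y                                # ≈ I(X; Z | Y); no bias correction applied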

CMIShannon with ValueHistogram

using CausalityTools
using Distributions
n = 1000
# X drives Y; Z is independent of both, so we expect I(X; Z | Y) ≈ 0 but I(X; Y | Z) > 0
x = rand(Epanechnikov(0.5, 1.0), n)
y = rand(Erlang(1), n) .+ x
z = rand(FDist(5, 2), n)
est = ValueHistogram(RectangularBinning(5))
condmutualinfo(CMIShannon(), est, x, z, y), condmutualinfo(CMIShannon(), est, x, y, z)
(0.003883726945883348, 0.14544599577667588)