Convenience functions for TE estimation
TE estimation between two data series
CausalityTools.te_reg — Function.
te_reg(source::AbstractArray{<:Real, 1}, 
    response::AbstractArray{<:Real, 1}, 
    k::Int, l::Int, m::Int; 
    η = 1, τ = 1, 
    estimator = VisitationFrequency(b = 2), 
    n_subdivs = 1)
TE estimation with default discretization scheme(s)
Calculate transfer entropy from source to response using the provided estimator on a rectangular discretization of a (k + l + m)-dimensional delay embedding of the input data, using an embedding delay of τ across all embedding components. η is the prediction lag.
Arguments
- source: The source data series.
- response: The response (target) data series.
- k: The dimension of the T_{f} component of the embedding.
- l: The dimension of the T_{pp} component of the embedding.
- m: The dimension of the S_{pp} component of the embedding.
Keyword arguments
- τ: The embedding lag. Default is τ = 1.
- η: The prediction lag. Default is η = 1.
- estimator: The transfer entropy estimator to use. The default is VisitationFrequency(b = 2).
- n_subdivs: The number of different partitions of varying coarseness to compute TE over, as described below. Default is n_subdivs = 1 (this way, TE is computed over two separate partitions). Unless n_subdivs = 0, make sure that n_subdivs is the same across analyses if they are to be compared.
More about the embedding
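To compute transfer entropy, we need an appropriate delay embedding of source (S) and target (T). For convenience, define
\begin{align}
T_f^{(k)} &= \{(T(i+\eta_k), \ldots, T(i+\eta_2), T(i+\eta_1))\} \\
T_{pp}^{(l)} &= \{(T(i), T(i-\tau_1), T(i-\tau_2), \ldots, T(i-\tau_{l-1}))\} \\
S_{pp}^{(m)} &= \{(S(i), S(i-\tau_1), S(i-\tau_2), \ldots, S(i-\tau_{m-1}))\}
\end{align}
where T_f denotes the k-dimensional set of vectors furnishing the future states of T, T_{pp} denotes the l-dimensional set of vectors furnishing the past and present states of T, and S_{pp} denotes the m-dimensional set of vectors furnishing the past and present of S. \eta is the prediction lag; for k = 1, T_f is simply T(i + \eta). This convenience function uses \tau_1 = τ, \tau_2 = 2*τ, \tau_3 = 3*τ, and so on.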
Combined, we get the generalised embedding \mathbb{E} = (T_f^{(k)}, T_{pp}^{(l)}, S_{pp}^{(m)}), which is discretized as described below.
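The construction above can be sketched in plain Julia. This is a minimal illustration, not the package's implementation: the helper name generalised_embedding is hypothetical, and it assumes k = 1 (a single future component T(i + η)) and lags that are multiples of τ, as described above.

```julia
# Hypothetical helper (not part of CausalityTools): build the generalised
# embedding (T_f, T_pp, S_pp) for k = 1, one row per time index i.
function generalised_embedding(source::AbstractVector{<:Real},
                               target::AbstractVector{<:Real},
                               l::Int, m::Int; η::Int = 1, τ::Int = 1)
    length(source) == length(target) ||
        throw(ArgumentError("source and target must have equal length"))
    maxlag = τ * (max(l, m) - 1)              # largest lag into the past
    idxs = (1 + maxlag):(length(target) - η)  # indices where every component exists
    E = Matrix{Float64}(undef, length(idxs), 1 + l + m)
    for (row, i) in enumerate(idxs)
        E[row, 1] = target[i + η]                        # T_f: T(i + η)
        for j in 1:l
            E[row, 1 + j] = target[i - (j - 1) * τ]      # T_pp: T(i), T(i - τ), ...
        end
        for j in 1:m
            E[row, 1 + l + j] = source[i - (j - 1) * τ]  # S_pp: S(i), S(i - τ), ...
        end
    end
    return E
end

# Toy series chosen so that each entry reveals its own time index.
S = collect(1.0:10.0)
T = collect(101.0:110.0)
E = generalised_embedding(S, T, 2, 1; η = 1, τ = 2)
```

Each row of E is one point of the (1 + l + m)-dimensional embedding. The first usable time index is i = 1 + τ(max(l, m) - 1), and the last is the series length minus η.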
More about discretization
To compute TE, we coarse-grain the (k + l + m)-dimensional generalised embedding \mathbb{E} = (T_f^{(k)}, T_{pp}^{(l)}, S_{pp}^{(m)}) into hyperrectangular boxes. The magnitude of the TE may be biased by the particular choice of binning scheme, so we compute TE across a number of different box sizes, determined as follows.
Let L be the number of observations in S (and T). The coarsest box size is determined by subdividing the i-th coordinate axis into N = \lceil L^{1/(k + l + m + 1)} \rceil intervals of equal length, resulting in a box size of |max(dim_{i}) - min(dim_{i})|/N. The next box size is given by |max(dim_{i}) - min(dim_{i})|/(N+1), then |max(dim_{i}) - min(dim_{i})|/(N+2), and so on, until the finest box size, which is given by |max(dim_{i}) - min(dim_{i})|/(N+N_{subdivs}), where N_{subdivs} is the value of n_subdivs.
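The box-size rule can be written out as a short Julia sketch. The function names interval_counts and box_sizes are hypothetical and only mirror the description above; they are not taken from the package.

```julia
# Hypothetical helper: number of equal-length intervals per axis for each
# partition. L is the number of observations, D = k + l + m the embedding dimension.
function interval_counts(L::Int, D::Int; n_subdivs::Int = 1)
    N = ceil(Int, L^(1 / (D + 1)))     # coarsest partition
    return collect(N:(N + n_subdivs))  # N, N + 1, ..., N + n_subdivs
end

# Box edge lengths along one coordinate axis x, one per partition.
box_sizes(x::AbstractVector{<:Real}, counts) = (maximum(x) - minimum(x)) ./ counts

# 1000 observations in a 3-dimensional embedding: 1000^(1/4) ≈ 5.62, so N = 6.
counts = interval_counts(1000, 3; n_subdivs = 2)
sizes = box_sizes([0.0, 3.0, 12.0], counts)  # axis range is 12
```

Under this rule, n_subdivs additional partitions are generated on top of the coarsest one, so the number of TE estimates would be n_subdivs + 1.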
Transfer entropy computation
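TE is then computed as
TE_{S \rightarrow T} = \int_{\mathbb{E}} P(T_f, T_{pp}, S_{pp}) \log_{b}{\left(\frac{P(T_f | T_{pp}, S_{pp})}{P(T_f | T_{pp})}\right)}
using the provided estimator (default = VisitationFrequency) for each of the discretizations. A vector of the TE estimates for each discretization is returned.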
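To make the estimation step concrete, here is a minimal sketch of a visitation-frequency style estimate on a single partition. It is an illustration under stated assumptions, not the CausalityTools implementation: all function names are hypothetical, k = 1, the embedding is a matrix whose columns are ordered (T_f, T_pp, S_pp), and each axis is split into N equal-width intervals. It relies on the standard identity TE = H(T_f, T_pp) + H(T_pp, S_pp) - H(T_pp) - H(T_f, T_pp, S_pp).

```julia
# Map every column of E onto integer bin labels 1..N (equal-width intervals).
function bin_labels(E::AbstractMatrix{<:Real}, N::Int)
    B = Matrix{Int}(undef, size(E))
    for j in axes(E, 2)
        lo, hi = extrema(view(E, :, j))
        w = hi > lo ? (hi - lo) / N : 1.0  # guard against a constant column
        B[:, j] = clamp.(floor.(Int, (view(E, :, j) .- lo) ./ w) .+ 1, 1, N)
    end
    return B
end

# Shannon entropy (base b) of the box visitation frequencies, restricted
# to the given columns (a marginal of the full partition).
function visitation_entropy(B::AbstractMatrix{Int}, cols::Vector{Int}, b::Real)
    counts = Dict{Vector{Int}, Int}()
    for i in axes(B, 1)
        key = B[i, cols]
        counts[key] = get(counts, key, 0) + 1
    end
    n = size(B, 1)
    return -sum(c / n * log(b, c / n) for c in values(counts))
end

# TE from S to T on one partition, combining four marginal entropies.
function te_visitation(E::AbstractMatrix{<:Real}, l::Int, m::Int, N::Int; b::Real = 2)
    B = bin_labels(E, N)
    f = [1]
    tpp = collect(2:(1 + l))
    spp = collect((2 + l):(1 + l + m))
    return visitation_entropy(B, vcat(f, tpp), b) +
           visitation_entropy(B, vcat(tpp, spp), b) -
           visitation_entropy(B, tpp, b) -
           visitation_entropy(B, vcat(f, tpp, spp), b)
end

# Toy embedding in which the future of T copies the present of S,
# while the present of T carries no information about it.
E_copy = [0.0 0.0 0.0;
          1.0 0.0 1.0;
          0.0 1.0 0.0;
          1.0 1.0 1.0]
te_copy = te_visitation(E_copy, 1, 1, 2)  # close to 1 bit
```

Repeating the call for each number of intervals per axis and collecting the results would give one estimate per discretization, matching the vector return value described above.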
te_reg(source::AbstractArray{<:Real, 1}, 
    response::AbstractArray{<:Real, 1}, 
    k::Int, l::Int, m::Int,
    binning_scheme::Vector{RectangularBinning}; 
    η = 1, τ = 1, 
    estimator = VisitationFrequency(b = 2))
TE with user-provided discretization scheme(s)
Calculate transfer entropy from source to response using the provided estimator on discretizations constructed by the provided binning_scheme(s) over a (k + l + m)-dimensional delay embedding of the input data, using an embedding delay of τ across all embedding components. η is the prediction lag.
Arguments
- source: The source data series.
- response: The response (target) data series.
- k: The dimension of the T_{f} component of the embedding.
- l: The dimension of the T_{pp} component of the embedding.
- m: The dimension of the S_{pp} component of the embedding.
- binning_scheme: The binning scheme(s) used to construct the partitions over which TE is computed. Must be either one or several instances of RectangularBinning (several instances are provided as a vector). TE is computed for each of the resulting partitions.
Keyword arguments
- τ: The embedding lag. Default is τ = 1.
- η: The prediction lag. Default is η = 1.
- estimator: The transfer entropy estimator to use. The default is VisitationFrequency(b = 2).
More about the embedding
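To compute transfer entropy, we need an appropriate delay embedding of source (S) and target (T). For convenience, define
\begin{align}
T_f^{(k)} &= \{(T(i+\eta_k), \ldots, T(i+\eta_2), T(i+\eta_1))\} \\
T_{pp}^{(l)} &= \{(T(i), T(i-\tau_1), T(i-\tau_2), \ldots, T(i-\tau_{l-1}))\} \\
S_{pp}^{(m)} &= \{(S(i), S(i-\tau_1), S(i-\tau_2), \ldots, S(i-\tau_{m-1}))\}
\end{align}
where T_f denotes the k-dimensional set of vectors furnishing the future states of T, T_{pp} denotes the l-dimensional set of vectors furnishing the past and present states of T, and S_{pp} denotes the m-dimensional set of vectors furnishing the past and present of S. \eta is the prediction lag; for k = 1, T_f is simply T(i + \eta). This convenience function uses \tau_1 = τ, \tau_2 = 2*τ, \tau_3 = 3*τ, and so on.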
Combined, we get the generalised embedding \mathbb{E} = (T_f^{(k)}, T_{pp}^{(l)}, S_{pp}^{(m)}), which is discretized as described below.
More about discretization
The discretization scheme must be either a single RectangularBinning instance, or a vector of RectangularBinning instances. Run ?RectangularBinning after loading CausalityTools for details.
Transfer entropy computation
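TE is then computed as
TE_{S \rightarrow T} = \int_{\mathbb{E}} P(T_f, T_{pp}, S_{pp}) \log_{b}{\left(\frac{P(T_f | T_{pp}, S_{pp})}{P(T_f | T_{pp})}\right)}
using the provided estimator (default = VisitationFrequency) for each of the discretizations. A vector of the TE estimates for each discretization is returned.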
TE estimation between two data series conditioned on a third series
CausalityTools.te_cond — Function.
te_cond(source::AbstractArray{<:Real, 1}, 
    response::AbstractArray{<:Real, 1},
    cond::AbstractArray{<:Real, 1},
    k::Int, l::Int, m::Int, n::Int; 
    η = 1, τ = 1, 
    estimator = VisitationFrequency(b = 2), 
    n_subdivs = 2)
Conditional TE with default discretization scheme(s)
Calculate transfer entropy from source to response conditioned on cond using the provided estimator on a rectangular discretization of a (k + l + m + n)-dimensional delay embedding of the input data, using an embedding delay of τ across all embedding components. η is the prediction lag.
Arguments
- source: The source data series.
- response: The response (target) data series.
- cond: The conditional data series.
- k: The dimension of the T_{f} component of the embedding.
- l: The dimension of the T_{pp} component of the embedding.
- m: The dimension of the S_{pp} component of the embedding.
- n: The dimension of the C_{pp} component of the embedding.
Keyword arguments
- τ: The embedding lag. Default is τ = 1.
- η: The prediction lag. Default is η = 1.
- estimator: The transfer entropy estimator to use. The default is VisitationFrequency(b = 2).
- n_subdivs: The number of different partitions of varying coarseness to compute TE over, as described below. Default is n_subdivs = 2. TE is computed once for each resulting partition. Unless n_subdivs = 0, make sure that n_subdivs is the same across analyses if they are to be compared.
More about the embedding
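To compute transfer entropy, we need an appropriate delay embedding of source (S), target (T) and cond (C). For convenience, define
\begin{align}
T_f^{(k)} &= \{(T(i+\eta_k), \ldots, T(i+\eta_2), T(i+\eta_1))\} \\
T_{pp}^{(l)} &= \{(T(i), T(i-\tau_1), T(i-\tau_2), \ldots, T(i-\tau_{l-1}))\} \\
S_{pp}^{(m)} &= \{(S(i), S(i-\tau_1), S(i-\tau_2), \ldots, S(i-\tau_{m-1}))\} \\
C_{pp}^{(n)} &= \{(C(i), C(i-\tau_1), C(i-\tau_2), \ldots, C(i-\tau_{n-1}))\}
\end{align}
where T_f denotes the k-dimensional set of vectors furnishing the future states of T, T_{pp} denotes the l-dimensional set of vectors furnishing the past and present states of T, S_{pp} denotes the m-dimensional set of vectors furnishing the past and present of S, and C_{pp} denotes the n-dimensional set of vectors furnishing the past and present of C. \eta is the prediction lag; for k = 1, T_f is simply T(i + \eta). This convenience function uses \tau_1 = τ, \tau_2 = 2*τ, \tau_3 = 3*τ, and so on.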
Combined, we get the generalised embedding \mathbb{E} = (T_f^{(k)}, T_{pp}^{(l)}, S_{pp}^{(m)}, C_{pp}^{(n)}), which is discretized as described below.
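A compact way to assemble this four-block embedding is to build each lagged block separately and concatenate the blocks column-wise. The sketch below is illustrative only: lagged and conditional_embedding are hypothetical names, k = 1 is assumed, and all three series are assumed to have the same length.

```julia
# Hypothetical helper: the columns x(i), x(i - τ), ..., x(i - (n - 1)τ)
# for every time index i in idxs, returned as a length(idxs) × n matrix.
lagged(x, idxs, n, τ) = [x[i - (j - 1) * τ] for i in idxs, j in 1:n]

# Generalised embedding (T_f, T_pp, S_pp, C_pp) with a single future component.
function conditional_embedding(source, target, cond, l, m, n; η = 1, τ = 1)
    maxlag = τ * (max(l, m, n) - 1)           # largest lag into the past
    idxs = (1 + maxlag):(length(target) - η)  # indices where every component exists
    T_f = reshape(target[idxs .+ η], :, 1)    # T(i + η) as a one-column matrix
    return hcat(T_f,
                lagged(target, idxs, l, τ),
                lagged(source, idxs, m, τ),
                lagged(cond, idxs, n, τ))
end

# Toy series chosen so that each entry reveals its series and time index.
S = collect(1.0:10.0)
T = collect(101.0:110.0)
C = collect(201.0:210.0)
Ec = conditional_embedding(S, T, C, 1, 1, 2; η = 1, τ = 1)
```

The column order follows \mathbb{E} = (T_f, T_{pp}, S_{pp}, C_{pp}), so any later marginalisation only has to select column ranges.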
More about discretization
To compute TE, we coarse-grain the (k + l + m + n)-dimensional generalised embedding \mathbb{E} = (T_f^{(k)}, T_{pp}^{(l)}, S_{pp}^{(m)}, C_{pp}^{(n)}) into hyperrectangular boxes. The magnitude of the TE may be biased by the particular choice of binning scheme, so we compute TE across a number of different box sizes, determined as follows.
Let L be the number of observations in S (and T and C). The coarsest box size is determined by subdividing the i-th coordinate axis into N = \lceil L^{1/(k + l + m + n + 1)} \rceil intervals of equal length, resulting in a box size of |max(dim_{i}) - min(dim_{i})|/N. The next box size is given by |max(dim_{i}) - min(dim_{i})|/(N+1), then |max(dim_{i}) - min(dim_{i})|/(N+2), and so on, until the finest box size, which is given by |max(dim_{i}) - min(dim_{i})|/(N+N_{subdivs}), where N_{subdivs} is the value of n_subdivs.
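As a worked example with assumed values (chosen for illustration, not taken from the package): for L = 1000 observations and k = l = m = n = 1, the embedding is 4-dimensional. Writing r_i = |max(dim_{i}) - min(dim_{i})| for the range of the i-th axis and taking n_subdivs = 2, the rule above gives

```latex
\begin{align}
N &= \left\lceil 1000^{1/(1 + 1 + 1 + 1 + 1)} \right\rceil
   = \left\lceil 1000^{1/5} \right\rceil
   = \lceil 3.98\ldots \rceil = 4 \\
\text{box sizes along axis } i &:\quad \frac{r_i}{4},\ \frac{r_i}{5},\ \frac{r_i}{6}
\end{align}
```

Because the exponent shrinks as the embedding dimension grows, higher-dimensional embeddings start from fewer intervals per axis for the same number of observations.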
Transfer entropy computation
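TE is then computed as
TE_{S \rightarrow T | C} = \int_{\mathbb{E}} P(T_f, T_{pp}, S_{pp}, C_{pp}) \log_{b}{\left(\frac{P(T_f | T_{pp}, S_{pp}, C_{pp})}{P(T_f | T_{pp}, C_{pp})}\right)}
using the provided estimator (default = VisitationFrequency) for each of the discretizations. A vector of the TE estimates for each discretization is returned.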
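For reference, the conditional transfer entropy from S to T given C is the conditional mutual information between T_f and S_{pp} given (T_{pp}, C_{pp}), and it can be expanded into four joint entropies over marginals of the embedding. This is the standard information-theoretic identity, not a statement about how the package organises the computation:

```latex
\begin{align}
TE_{S \rightarrow T | C}
  &= I(T_f ; S_{pp} \mid T_{pp}, C_{pp}) \\
  &= H(T_f \mid T_{pp}, C_{pp}) - H(T_f \mid T_{pp}, S_{pp}, C_{pp}) \\
  &= H(T_f, T_{pp}, C_{pp}) + H(T_{pp}, S_{pp}, C_{pp})
     - H(T_{pp}, C_{pp}) - H(T_f, T_{pp}, S_{pp}, C_{pp})
\end{align}
```

On a given partition, each entropy can be estimated from the fraction of embedding points that fall in each box of the corresponding marginal, with logarithms taken to base b.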
te_cond(source::AbstractArray{<:Real, 1}, 
    response::AbstractArray{<:Real, 1},
    cond::AbstractArray{<:Real, 1},
    k::Int, l::Int, m::Int, n::Int,
    binning_scheme::Vector{RectangularBinning}; 
    η = 1, τ = 1, 
    estimator = VisitationFrequency(b = 2))
Conditional TE with user-provided discretization scheme(s)
Calculate transfer entropy from source to response conditioned on cond using the provided estimator on discretizations constructed by the provided binning_scheme(s) over a (k + l + m + n)-dimensional delay embedding of the input data, using an embedding delay of τ across all embedding components. η is the prediction lag.
Arguments
- source: The source data series.
- response: The response (target) data series.
- cond: The conditional data series.
- k: The dimension of the T_{f} component of the embedding.
- l: The dimension of the T_{pp} component of the embedding.
- m: The dimension of the S_{pp} component of the embedding.
- n: The dimension of the C_{pp} component of the embedding.
- binning_scheme: The binning scheme(s) used to construct the partitions over which TE is computed. Must be either one or several instances of RectangularBinning (several instances are provided as a vector). TE is computed for each of the resulting partitions.
Keyword arguments
- τ: The embedding lag. Default is τ = 1.
- η: The prediction lag. Default is η = 1.
- estimator: The transfer entropy estimator to use. The default is VisitationFrequency(b = 2).
More about the embedding
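To compute transfer entropy, we need an appropriate delay embedding of source (S), target (T) and cond (C). For convenience, define
\begin{align}
T_f^{(k)} &= \{(T(i+\eta_k), \ldots, T(i+\eta_2), T(i+\eta_1))\} \\
T_{pp}^{(l)} &= \{(T(i), T(i-\tau_1), T(i-\tau_2), \ldots, T(i-\tau_{l-1}))\} \\
S_{pp}^{(m)} &= \{(S(i), S(i-\tau_1), S(i-\tau_2), \ldots, S(i-\tau_{m-1}))\} \\
C_{pp}^{(n)} &= \{(C(i), C(i-\tau_1), C(i-\tau_2), \ldots, C(i-\tau_{n-1}))\}
\end{align}
where T_f denotes the k-dimensional set of vectors furnishing the future states of T, T_{pp} denotes the l-dimensional set of vectors furnishing the past and present states of T, S_{pp} denotes the m-dimensional set of vectors furnishing the past and present of S, and C_{pp} denotes the n-dimensional set of vectors furnishing the past and present of C. \eta is the prediction lag; for k = 1, T_f is simply T(i + \eta). This convenience function uses \tau_1 = τ, \tau_2 = 2*τ, \tau_3 = 3*τ, and so on.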
Combined, we get the generalised embedding \mathbb{E} = (T_f^{(k)}, T_{pp}^{(l)}, S_{pp}^{(m)}, C_{pp}^{(n)}), which is discretized as described below.
More about discretization
The discretization scheme must be either a single RectangularBinning instance, or a vector of RectangularBinning instances. Run ?RectangularBinning after loading CausalityTools for details.
Transfer entropy computation
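TE is then computed as
TE_{S \rightarrow T | C} = \int_{\mathbb{E}} P(T_f, T_{pp}, S_{pp}, C_{pp}) \log_{b}{\left(\frac{P(T_f | T_{pp}, S_{pp}, C_{pp})}{P(T_f | T_{pp}, C_{pp})}\right)}
using the provided estimator (default = VisitationFrequency) for each of the discretizations. A vector of the TE estimates for each discretization is returned.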