Low-level transfer entropy estimators
For complete control over the estimation procedure, the analyst must create a delay reconstruction from the input data, specify a discretization scheme which can be either rectangular or triangulated, and map variables of the delay reconstruction to the correct marginals.
The package provides some convenience methods to compute TE directly from time series, see the wrappers for regular TE and conditional TE. However, be absolutely sure that you understand what they do before applying them to real problems.
Estimators
Valid estimator types for rectangular partitions are
VisitationFrequency
. An implementation of the original TE estimator from Schreiber (2000)1TransferOperatorGrid
. An implementation of the transfer operator grid estimator from Diego et al. (2019)2
General workflow for TE estimation
The general workflow for estimating transfer entropy over rectangular partitions is as follows.
Rectangular partitions
To estimate transfer entropy over rectangular partitions, you would use the following method, providing either a VisitationFrequency
or TransferOperatorGrid
instance to the estimator
argument.
#
TransferEntropy.transferentropy
— Method.
transferentropy(pts, vars::TEVars, ϵ::RectangularBinning,
estimator::TransferEntropyEstimator; b = 2)
Transfer entropy using a rectangular partition
Estimate transfer entropy on a rectangular partition over a set of points pts
, which represent a generalised embedding of data series x, y and (potentially) z of the form outlined below.
Note: pts
must be a vector of states, not a vector of variables/(time series). Wrap your time series in a Dataset
first if the latter is the case.
Estimators (and their acronyms)
- Visitation frequency estimator:
VisitationFrequency
; an instance must be provided. - Transfer operator grid estimator:
TransferOperatorGrid
; an instance must be provided.
Relationship between pts
and vars
pts
should be an embedding of the form (y(t + \eta)^{k}, y(t)^{l}, x(t)^{m}, z(t)^{n}. Here, y(t + \eta)^{k}) indicates that k future states of y
should be included, y(t)^{l} indicates that l present/past states of y should be included, x(t)^{m} indicates that m present/past states of x should be included, and z(t)^{n} indicates that n present/past states of the variable we're conditioning on should be included in the embedding vectors. Thus, the total dimension of the embedding space will be k + l + m + n.
vars
is a TEVars
instance contain the instruction on which variables of the embedding will be treated as part of which marginals during transfer entropy computation. Check the documentation of TEVars
for more details.
Example
1. Time series
We'll generate two 80-point long realizations, x and y, of two 1D logistic maps starting at different initial conditions.
sys1 = DynamicalSystems.Systems.logistic()
sys2 = DynamicalSystems.Systems.logistic()
x = trajectory(sys1, 80, Ttr = 1000);
y = trajectory(sys1, 80, Ttr = 1000);
# Wrap the time series in a dataset containing the states of the
# composite system.
D = Dataset(x, y)
2. Generalised embedding
Say we want to compute transfer entropy from x to y, and that we require a 4-dimensional embedding. We do an appropriate delay reconstruction of the data (E = \{S_{pp}, T_{pp}, T_f \}= \{x_t, (y_t, y_{t-\tau}), y_{t+\eta} \}), so that we're computing the following TE
To create the embedding, we'll use the customembed
function (check its documentation for a detailed explanation on how it works).
# Embed the data, putting time series in the 2nd column (y) of `data` in the
# first three embedding columns, lagging them with lags (η, 0, -τ), and
# putting the 1st column of `data` (x) in the last position of the embedding,
# not lagging it.
τ = optimal_delay(y) # embedding lag
η = 2 # prediction lag
pts = customembed(D, Positions(2, 2, 2, 1), Lags(η, 0, -τ, 0))
3. Instructions to the estimator
Now, tell the estimator how to relative the dynamical variables of the generalised embedding to the marginals in the transfer entropy computation.
vars = TEVars(Tf = [1], Tpp = [2, 3], Spp = [4])
4. Rectangular grid specification
We'll compute transfer entropy using the visitation frequency estimator over a rectangular partition where the box sizes are determined by splitting each coordinate axis into 6 equally spaced intervals each.
binning = RectangularBinning(6)
5. Compute transfer entropy
# Over a single rectangular grid
transferentropy(pts, vars, binning, VisitationFrequency()) #, or
transferentropy(pts, vars, binning, TransferOperatorGrid())
# Over multiple cubic grids with differing box sizes
# logarithmically spaced from edge length 0.001 to 0.3
ϵs = 10 .^ range(log(10, 0.001), log10(0.3), length = 15)
map(ϵ -> transferentropy(pts, vars, RectangularBinning(ϵ), VisitationFrequency())) #, or
map(ϵ -> transferentropy(pts, vars, RectangularBinning(ϵ), TransferOperatorGrid()))
Triangulated partitions
Estimators for computing TE on triangulated partitions, whose invariant distribution is obtained through the transfer operator, was also introduced in Diego et al. (2019)2.
#
TransferEntropy.transferentropy
— Method.
transferentropy(μ::AbstractTriangulationInvariantMeasure, vars::TEVars,
binning_scheme::RectangularBinning;
estimator = VisitationFrequency(), n::Int = 10000, b = 2)
Transfer entropy using a precomputed invariant measure over a triangulated partition
Estimate transfer entropy from an invariant measure over a triangulation that has been precomputed either as
μ = invariantmeasure(pts, TriangulationBinning(), ApproximateIntersection())
, orμ = invariantmeasure(pts, TriangulationBinning(), ExactIntersection())
where the first method uses approximate simplex intersections (faster) and the second method uses exact simplex intersections (slow). μ
contains all the information needed to compute transfer entropy.
Note: pts
must be a vector of states, not a vector of variables/(time series). Wrap your time series in a Dataset
first if the latter is the case.
Computing transfer entropy (triangulation -> rectangular partition)
Because we need to compute marginals, we need a rectangular grid. To do so, transfer entropy is computed by sampling the simplices of the triangulation according to their measure with a total of approximately n
points. Introducing multiple points as representatives for the partition elements does not introduce any bias, because we in computing the invariant measure, we use no more information than what is encoded in the dynamics of the original data points. However, from the invariant measure, we can get a practically infinite amount of points to estimate transfer entropy from.
Then, transfer entropy is estimated using the visitation frequency estimator on those points (see docs for transferentropy_visitfreq
for more information), on a rectangular grid specified by binning_scheme
.
Common use case
This method is good to use if you want to explore the sensitivity of transfer entropy to the bin size in the final rectangular grid, when you have few observations in the time series. The invariant measure, which encodes the dynamical information, is slow to compute over the triangulation, but only needs to be computed once. After that, transfer entropy may be estimated at multiple scales very quickly.
Example
# Assume these points are an appropriate delay embedding {(x(t), y(t), y(t+1))} and
# that we're measure transfer entropy from x -> y.
pts = invariantize([rand(3) for i = 1:30])
v = TEVars(Tf = [3], Tpp = [2], Spp = [1])
# Compute invariant measure over a triangulation using approximate
# simplex intersections. This is relatively slow.
μ = invariantmeasure(pts, TriangulationBinning(), ApproximateIntersection())
# Compute transfer entropy from the invariant measure over multiple
# bin sizes. This is fast, because the measure has been precomputed.
tes = map(ϵ -> transferentropy(μ, v, RectangularBinning(ϵ)), 2:50)