Numerical Data
The word "timeseries" can be confusing, because it can mean a univariate (also called scalar or one-dimensional) timeseries or a multivariate (also called multi-dimensional) timeseries. To resolve this confusion, DynamicalSystems.jl follows this convention: "timeseries" always refers to a one-dimensional vector of numbers, defined with respect to another one-dimensional vector of numbers that serves as the time vector. The word "trajectory", on the other hand, refers to a multi-dimensional timeseries, which is simply a set of one-dimensional timeseries. A trajectory is represented by a Dataset and is the return type of trajectory.
Trajectories (and in general sets in state space) in DynamicalSystems.jl are represented by a structure called Dataset (while timeseries are standard Julia Vectors).
DelayEmbeddings.Dataset — Type
Dataset{D, T} <: AbstractDataset{D,T}
A dedicated interface for datasets. It contains equally-sized datapoints of length D, represented by SVector{D, T}.
When indexed with 1 index, a dataset is like a vector of datapoints. When indexed with 2 indices it behaves like a matrix whose columns are the timeseries of the individual variables.
Dataset also supports most sensible operations like append!, push!, hcat, eachrow, among others.
Description of indexing
In the following let i, j be integers, typeof(data) <: AbstractDataset and v1, v2 be <: AbstractVector{Int} (v1, v2 could also be ranges).
data[i] gives the ith datapoint (returns an SVector)
data[v1] will return a vector of datapoints
data[v1, :] using a Colon as a second index will return a Dataset of these points
data[:, j] gives the jth variable timeseries, as Vector
data[v1, v2] returns a Dataset with the appropriate entries (first indices being "time"/point index, while second being variables)
data[i, j] value of the jth variable, at the ith timepoint
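The indexing rules above can be tried out in a short session. This is a minimal sketch that assumes a DynamicalSystems.jl release which still exports Dataset as described here; the variable names are only illustrative.

```julia
using DynamicalSystems  # assumed: a release that still exports `Dataset`

data = Dataset(rand(10, 3))   # 10 points of a 3-variable trajectory

p   = data[2]             # a single datapoint (an SVector of length 3)
sub = data[1:4, :]        # Dataset holding the first four points
y   = data[:, 2]          # timeseries of the second variable, as a Vector
blk = data[1:4, [1, 3]]   # Dataset with points 1 to 4 of variables 1 and 3
v   = data[2, 3]          # third variable at the second timepoint
```

Note that the first index always selects points ("time") and the second always selects variables, exactly as for a matrix with one variable per column.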
Use Matrix(dataset) or Dataset(matrix) to convert. It is assumed that each column of the matrix is one variable. If you have various timeseries vectors x, y, z, ... pass them like Dataset(x, y, z, ...). You can use columns(dataset) to obtain the reverse, i.e. all columns of the dataset in a tuple.
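As a small illustration of these conversions, the sketch below builds a dataset from two timeseries and converts it back and forth. It assumes a DynamicalSystems.jl release that still exports Dataset and columns as documented here; the sample values are arbitrary.

```julia
using DynamicalSystems  # assumed: a release that still exports `Dataset`

x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]

data  = Dataset(x, y)      # two variables, three timepoints
M     = Matrix(data)       # 3×2 matrix, one column per variable
data2 = Dataset(M)         # back to a Dataset
xs, ys = columns(data)     # recover the individual timeseries
```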
In essence a Dataset is simply a wrapper for a Vector of SVectors. However, it is visually represented as a matrix, similarly to how numerical data would be printed on a spreadsheet (with each row being one point in time and each column one variable). It also offers a lot more functionality than just pretty-printing. Besides the examples in the documentation string, you can also do:
using DynamicalSystems
hen = Systems.henon()
data = trajectory(hen, 10000) # this returns a dataset
for point in data
    # do stuff with each data-point
    # (vector with as many elements as system dimension)
end
Most functions from DynamicalSystems.jl that manipulate and use data expect a Dataset. This allows us to define efficient methods that coordinate well with each other, such as embed.
Dataset Functions
Functions that operate on datasets.
DelayEmbeddings.minima — Function
minima(dataset)
Return an SVector that contains the minimum elements of each timeseries of the dataset.
DelayEmbeddings.maxima — Function
maxima(dataset)
Return an SVector that contains the maximum elements of each timeseries of the dataset.
DelayEmbeddings.minmaxima — Function
minmaxima(dataset)
Return minima(dataset), maxima(dataset) without doing the computation twice.
DelayEmbeddings.columns — Function
columns(dataset) -> x, y, z, ...
Return the individual columns of the dataset.
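A short sketch of how these four functions could be used together, assuming a DynamicalSystems.jl release that still exports them under these names. The data values are arbitrary and chosen so the extrema are easy to read off.

```julia
using DynamicalSystems  # assumed: a release that still exports `Dataset`

x = [1.0, 4.0, 2.0]
y = [10.0, -3.0, 5.0]
data = Dataset(x, y)

lo = minima(data)            # per-variable minimum, one entry per column
hi = maxima(data)            # per-variable maximum, one entry per column
lo2, hi2 = minmaxima(data)   # both at once
xs, ys = columns(data)       # the columns as separate timeseries
```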
Dataset I/O
Input/output functionality for an AbstractDataset is already provided by Julia's standard library, specifically by writedlm and readdlm from DelimitedFiles. To write and read a dataset, simply do:
using DelimitedFiles
data = Dataset(rand(1000, 2))
# I will write and read using delimiter ','
writedlm("data.txt", data, ',')
# Don't forget to convert the matrix to a Dataset when reading
data = Dataset(readdlm("data.txt", ',', Float64))
Neighborhoods
Combining the excellent performance of NearestNeighbors.jl with the AbstractDataset allows us to define a function that calculates a "neighborhood" of a given point, i.e. finds other points near it. The different "types" of the neighborhoods are subtypes of AbstractNeighborhood.
DelayEmbeddings.neighborhood — Function
neighborhood(point, tree, ntype)
neighborhood(point, tree, ntype, n::Int, w::Int = 1)
Return a vector of indices which are the neighborhood of point in some data, where the tree was created using tree = KDTree(data [, metric]). The ntype is the type of neighborhood and can be any subtype of AbstractNeighborhood.
Use the second method when the point belongs to the data, i.e. point = data[n]. Then w stands for the Theiler window (positive integer). Only points whose index i satisfies abs(i - n) ≥ w are returned as a neighborhood, to exclude close temporal neighbors. The default w=1 is the case of excluding only the point itself.
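To make the effect of the Theiler window concrete, here is a brute-force sketch in plain Julia. It is not the library's implementation (which uses a KDTree), and theiler_knn is a hypothetical helper name introduced only for this illustration.

```julia
using LinearAlgebra: norm

# Hypothetical helper, not part of DelayEmbeddings: brute-force search for the
# K nearest neighbors of points[n], keeping only indices i with abs(i - n) >= w.
function theiler_knn(points::Vector{Vector{Float64}}, n::Int, K::Int, w::Int = 1)
    # Candidate indices that lie outside the Theiler window around n
    candidates = [i for i in eachindex(points) if abs(i - n) >= w]
    # Order the candidates by Euclidean distance to the query point
    sort!(candidates; by = i -> norm(points[i] - points[n]))
    return candidates[1:min(K, length(candidates))]
end

# Points along a line, so temporal neighbors are also the closest in space
points = [[Float64(i), 0.0] for i in 1:10]

idxs_default = theiler_knn(points, 5, 2)      # w = 1 excludes only index 5
idxs_window  = theiler_knn(points, 5, 2, 3)   # w = 3 also excludes 3, 4, 6, 7
```

With w = 1 the two neighbors are the immediate temporal neighbors of point 5, while w = 3 forces the search to skip them and return points that are further away in time.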
References
neighborhood simply interfaces the functions NearestNeighbors.knn and inrange from NearestNeighbors.jl by using the argument ntype.
DelayEmbeddings.AbstractNeighborhood — Type
AbstractNeighborhood
Supertype of methods for deciding the neighborhood of points for a given point.
Concrete subtypes:
FixedMassNeighborhood(K::Int): The neighborhood of a point consists of the K nearest neighbors of the point.
FixedSizeNeighborhood(ε::Real): The neighborhood of a point consists of all neighbors that have distance < ε from the point.
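The difference between the two neighborhood types can be sketched with a brute-force computation in plain Julia. This only illustrates the two definitions and is not how the library computes them; the point set and the values of K and ε are arbitrary.

```julia
using LinearAlgebra: norm

points = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.3], [1.0, 1.0], [2.0, 2.0]]
query  = [0.0, 0.0]

# Distance of every point to the query point
dists = [norm(p - query) for p in points]

# "Fixed mass": always exactly K indices, whatever their distances are
K = 3
fixed_mass = sortperm(dists)[1:K]

# "Fixed size": every index with distance < ε, however many there are
ε = 0.2
fixed_size = findall(d -> d < ε, dists)
```

Here the query coincides with the first point and no Theiler window is applied, so index 1 appears in both results.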
See neighborhood for more.