Numerical Data

StateSpaceSets - Module

StateSpaceSets.jl


A Julia package that provides functionality for state space sets. These are ordered collections of points of fixed length (called dimension). It is used by many other packages in the JuliaDynamics organization. The main export of StateSpaceSets is the concrete type StateSpaceSet. The package also provides functionality for distances, neighbor searches, sampling, and normalization.

To install it, you may run import Pkg; Pkg.add("StateSpaceSets"). However, there is no real reason to install this package directly, as it is re-exported by all downstream packages that use it.

Timeseries and datasets

The word "timeseries" can be confusing, because it can mean a univariate (also called scalar or one-dimensional) timeseries or a multivariate (also called multi-dimensional) timeseries. To resolve this confusion, in DynamicalSystems.jl we have the following convention: "timeseries" is always univariate! It refers to a one-dimensional vector of numbers, which exists with respect to some other one-dimensional vector of numbers that corresponds to a time vector. On the other hand, we use the term "state space set" to refer to a multi-dimensional timeseries, which is simply a group/set of one-dimensional timeseries represented as a StateSpaceSet.

StateSpaceSet

Trajectories, and in general sets in state space, are represented by a structure called StateSpaceSet in DynamicalSystems.jl (while timeseries are always standard Julia Vectors). It is recommended to always standardize datasets.

StateSpaceSets.StateSpaceSet - Type
StateSpaceSet{D, T, V} <: AbstractVector{V}

A dedicated interface for sets in a state space. It is an ordered container of equally-sized points of length D, with element type T, represented by a vector of type V. Typically V is SVector{D,T} or Vector{T} and the data are always stored internally as Vector{V}. SSSet is an alias for StateSpaceSet.

The underlying Vector{V} can be obtained by vec(ssset), although this is almost never necessary because StateSpaceSet subtypes AbstractVector and extends its interface. StateSpaceSet also supports almost all sensible vector operations like append!, push!, hcat, eachrow, among others. When iterated over, it iterates over its contained points.

Construction

Constructing a StateSpaceSet is done in three ways:

  1. By giving each individual column of the state space set as a Vector{<:Real}: StateSpaceSet(x, y, z, ...).
  2. By giving a matrix whose rows are the state space points: StateSpaceSet(m).
  3. By giving directly a vector of vectors (state space points): StateSpaceSet(v_of_v).

All constructors allow for the keyword container which sets the type of V (the type of inner vectors). At the moment options are only SVector, MVector, or Vector, and by default SVector is used.
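
The three construction routes listed above can be sketched as follows. This is a minimal illustration with random data; the variable names are placeholders, and it assumes a StateSpaceSets.jl version that exports dimension.

```julia
using StateSpaceSets

# 1. From individual columns: one vector per variable
x, y, z = rand(100), rand(100), rand(100)
A = StateSpaceSet(x, y, z)

# 2. From a matrix whose rows are the state space points
m = rand(100, 3)
B = StateSpaceSet(m)

# 3. From a vector of vectors: each inner vector is one point
v_of_v = [rand(3) for _ in 1:100]
C = StateSpaceSet(v_of_v)

# All three hold 100 points in a 3-dimensional state space
(length(A), dimension(A))
```

In all three cases the number of points is the length of the set and the number of variables is its dimension.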

Description of indexing

When indexed with 1 index, StateSpaceSet behaves exactly like its encapsulated vector, i.e., a vector of vectors (state space points). When indexed with 2 indices, it behaves like a matrix where each row is a point.

In the following let i, j be integers, typeof(X) <: AbstractStateSpaceSet and v1, v2 be <: AbstractVector{Int} (v1, v2 could also be ranges, and for performance benefits make v2 an SVector{Int}).

  • X[i] == X[i, :] gives the ith point (returns an SVector)
  • X[v1] == X[v1, :], returns a StateSpaceSet with the points in those indices.
  • X[:, j] gives the jth variable timeseries (or collection), as Vector
  • X[v1, v2], X[:, v2] returns a StateSpaceSet with the appropriate entries (first indices being "time"/point index, while second being variables)
  • X[i, j] value of the jth variable, at the ith timepoint

Use Matrix(ssset) or StateSpaceSet(matrix) to convert. It is assumed that each column of the matrix is one variable. If you have various timeseries vectors x, y, z, ... pass them like StateSpaceSet(x, y, z, ...). You can use columns(dataset) to obtain the reverse, i.e. all columns of the dataset in a tuple.
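
As an illustration of the indexing rules above, here is a short sketch on a small random set. The names X, p, xs and so on are placeholders chosen for this example.

```julia
using StateSpaceSets

X = StateSpaceSet(rand(10, 3))   # 10 points in a 3-dimensional state space

p  = X[2]          # the 2nd point (an SVector with the default container)
xs = X[:, 1]       # timeseries of the 1st variable, as a Vector
S  = X[1:5]        # StateSpaceSet with the first five points
P  = X[:, [1, 3]]  # StateSpaceSet that keeps only variables 1 and 3
v  = X[2, 3]       # value of the 3rd variable at the 2nd point

M = Matrix(X)      # 10×3 matrix, one variable per column
Y = StateSpaceSet(M)   # and back to a StateSpaceSet
```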


In essence, a StateSpaceSet is simply a wrapper for a Vector of SVectors. However, it is visually represented as a matrix, similarly to how numerical data would be printed on a spreadsheet (with time being the column direction). It also offers a lot more functionality than just pretty-printing. Besides the examples in the documentation string, you can, for example, iterate over data points:

using DynamicalSystems
hen = Systems.henon()
data, t = trajectory(hen, 10000) # returns a StateSpaceSet and its time vector (DynamicalSystems.jl v3 and later)
for point in data
    # stuff
end

Most functions from DynamicalSystems.jl that manipulate or use multidimensional data expect a StateSpaceSet.

StateSpaceSet accesses

StateSpaceSets.minima - Function
minima(dataset)

Return an SVector that contains the minimum elements of each timeseries of the dataset.

StateSpaceSets.maxima - Function
maxima(dataset)

Return an SVector that contains the maximum elements of each timeseries of the dataset.

StateSpaceSets.columns - Function
columns(ssset) -> x, y, z, ...

Return the individual columns of the state space set allocated as Vectors. Equivalent to collect(eachcol(ssset)).
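
A small sketch of the three accessors above, using a hand-written two-variable set so the results are easy to check by eye:

```julia
using StateSpaceSets

# Two variables, three points: (1, 6), (2, 5), (3, 4)
X = StateSpaceSet([1.0, 2.0, 3.0], [6.0, 5.0, 4.0])

mi = minima(X)     # per-variable minima: (1.0, 4.0)
ma = maxima(X)     # per-variable maxima: (3.0, 6.0)

x, y = columns(X)  # the two variables as separate Vectors
```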


Basic statistics

StateSpaceSets.standardize - Function
standardize(d::StateSpaceSet) → r

Create a standardized version of the input set where each column is transformed to have mean 0 and standard deviation 1.

standardize(x::AbstractVector{<:Real}) = (x .- mean(x)) ./ std(x)

Statistics.cor - Function
cor(d::StateSpaceSet) → m::SMatrix

Compute the correlation matrix m from the columns of d, where m[i, j] is the correlation between d[:, i] and d[:, j].

Statistics.cov - Function
cov(d::StateSpaceSet) → m::SMatrix

Compute the covariance matrix m from the columns of d, where m[i, j] is the covariance between d[:, i] and d[:, j].

StateSpaceSets.mean_and_cov - Function
mean_and_cov(d::StateSpaceSet) → μ, m::SMatrix

Return a tuple of the column means μ and covariance matrix m.

Column means are always computed for the covariance matrix, so this is faster than computing both quantities separately.
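
The statistics functions above can be tried on synthetic data as in the following sketch. The scaling and offsets are arbitrary choices for illustration, and mean_and_cov is called with its module prefix in case it is not exported in your installed version.

```julia
using StateSpaceSets, Statistics

# Two variables with different scales and offsets
X = StateSpaceSet(randn(1000, 2) .* [1.0 5.0] .+ [10.0 -3.0])

S = standardize(X)   # each column now has mean ≈ 0 and std ≈ 1

R = cor(X)           # 2×2 correlation matrix
C = cov(X)           # 2×2 covariance matrix

# Means and covariance in one pass
μ, m = StateSpaceSets.mean_and_cov(X)
```

Standardizing before computing distances or neighborhoods keeps a variable with a large numerical range from dominating the result.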


StateSpaceSet distances

Two datasets

StateSpaceSets.set_distance - Function
set_distance(ssset1, ssset2 [, distance])

Calculate a distance between two StateSpaceSets, i.e., a distance defined between sets of points, as dictated by distance.

Possible distance types are:

  • Hausdorff
  • Centroid
  • StrictlyMinimumDistance

StateSpaceSets.Hausdorff - Type
Hausdorff(metric = Euclidean())

A distance that can be used in set_distance. The Hausdorff distance is the greatest of all the distances from a point in one set to the closest point in the other set. The distance is calculated with the metric given to Hausdorff which defaults to Euclidean.

Hausdorff is 2x slower than StrictlyMinimumDistance, however it is a proper metric in the space of sets of state space sets.

This metric only works for StateSpaceSets whose elements are SVectors.

StateSpaceSets.Centroid - Type
Centroid(metric = Euclidean())

A distance that can be used in set_distance. The Centroid method returns the distance (according to metric) between the centroids (a.k.a. centers of mass) of the sets.

metric can be any function that takes in two static vectors and returns a positive definite number to use as a distance (and typically is a Metric from Distances.jl).

StateSpaceSets.StrictlyMinimumDistance - Type
StrictlyMinimumDistance([brute = false,] [metric = Euclidean(),])

A distance that can be used in set_distance. The StrictlyMinimumDistance returns the minimum distance of all the distances from a point in one set to the closest point in the other set. The distance is calculated with the given metric.

The brute::Bool argument switches the computation between a KDTree-based version and brute force (i.e., calculation of all distances and picking the smallest one). Brute force performs better for datasets that are either high-dimensional or have a small number of points. Deciding a cutting point is not trivial, and it is recommended to simply benchmark the set_distance function to make a decision.

If brute = false this metric only works for StateSpaceSets whose elements are SVectors.
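
To compare the three distance types described above, the following sketch measures the distance between two random point clouds, the second shifted by 10 along the first variable. The clouds and the shift are arbitrary choices for illustration, and the zero-argument constructors are assumed to use the documented defaults.

```julia
using StateSpaceSets

A = StateSpaceSet(rand(100, 2))                 # points in the unit square
B = StateSpaceSet(rand(100, 2) .+ [10.0 0.0])   # same, shifted by 10 along x

dc = set_distance(A, B, Centroid())                 # distance between centers of mass
dh = set_distance(A, B, Hausdorff())                # greatest closest-point distance
dm = set_distance(A, B, StrictlyMinimumDistance())  # smallest pairwise distance
```

For these two clouds the minimum distance is the smallest of the three and the Hausdorff distance is at least as large as the minimum distance, which follows from their definitions.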


Sets of datasets

StateSpaceSets.setsofsets_distances - Function
setsofsets_distances(a₊, a₋ [, distance]) → distances

Calculate distances between sets of StateSpaceSets. Here a₊, a₋ are containers of StateSpaceSets, and the returned distances are dictionaries of distances. Specifically, distances[i][j] is the distance of the set in the i key of a₊ to the j key of a₋. Distances from a₋ to a₊ are not computed at all, assuming symmetry in the distance function.

The distance can be anything valid for set_distance.
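
A minimal sketch of the nested-dictionary result described above, assuming the containers are plain Dicts keyed by integers. The helper functions near and far and the names a_plus and a_minus are hypothetical, introduced only for this example.

```julia
using StateSpaceSets

# Hypothetical helpers: a cloud near the origin and one shifted by 10 along x
near() = StateSpaceSet(rand(50, 2))
far()  = StateSpaceSet(rand(50, 2) .+ [10.0 0.0])

a_plus  = Dict(1 => near(), 2 => far())
a_minus = Dict(1 => near(), 2 => far())

distances = setsofsets_distances(a_plus, a_minus, Centroid())

distances[1][1]   # small: both sets are near the origin
distances[1][2]   # large: set 2 of a_minus is shifted by 10
```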


StateSpaceSet I/O

Input/output functionality for an AbstractStateSpaceSet is already achieved using base Julia, specifically writedlm and readdlm. To write and read a dataset, simply do:

using DelimitedFiles

data = StateSpaceSet(rand(1000, 2))

# I will write and read using delimiter ','
writedlm("data.txt", data, ',')

# Don't forget to convert the matrix to a StateSpaceSet when reading
data = StateSpaceSet(readdlm("data.txt", ',', Float64))

Neighborhoods

Neighborhoods refer to the common act of finding points in a dataset that are near a given point (which typically belongs to the dataset). DynamicalSystems.jl bases this interface on Neighborhood.jl. You can go to its documentation if you are interested in finding neighbors in a dataset for e.g. a custom algorithm implementation.

For DynamicalSystems.jl, what is relevant are the two types of neighborhoods that exist:

Neighborhood.NeighborNumber - Type
NeighborNumber(k::Int) <: SearchType

Search type representing the k nearest neighbors of the query (or approximate neighbors, depending on the search structure).

Neighborhood.WithinRange - Type
WithinRange(r::Real) <: SearchType

Search type representing all neighbors with distance ≤ r from the query (according to the search structure's metric).
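
The two search types can be passed to Neighborhood.jl's search functions, as in the following sketch. It assumes Neighborhood.jl is installed and loaded directly, and it uses that package's searchstructure and isearch functions with a KDTree; the underlying vector of points is obtained with vec so that no extra integration between the packages is assumed.

```julia
using StateSpaceSets, Neighborhood

X = StateSpaceSet(rand(1000, 3))

# Build a KDTree over the points (vec gives the underlying Vector of SVectors)
tree = searchstructure(KDTree, vec(X), Euclidean())

query = X[1]

# Indices of the 5 nearest neighbors of the query
knn = isearch(tree, query, NeighborNumber(5))

# Indices of all points with distance ≤ 0.1 from the query
inrange = isearch(tree, query, WithinRange(0.1))
```

Because the query is itself a member of the set, its own index appears in both results; Neighborhood.jl offers ways to exclude such points, which are described in its documentation.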


Samplers

StateSpaceSets.statespace_sampler - Function
statespace_sampler(region [, seed = 42]) → sampler, isinside

A function that facilitates sampling points randomly and uniformly in a state space region. It generates two functions:

  • sampler is a 0-argument function that, when called, generates a random point inside the state space region. The point is always a Vector for type stability, irrespective of dimension. Generally, the generated point should be copied if it needs to be stored, because calling sampler() reuses a shared vector. sampler is a thread-safe function.
  • isinside is a 1-argument function that returns true if the given state space point is inside the region.

The region can be an instance of any of the following types (input arguments if not specified are vectors of length D, with D the state space dimension):

  • HSphere(radius::Real, center): points inside the hypersphere (boundary excluded). Convenience method HSphere(radius::Real, D::Int) makes the center a D-long vector of zeros.
  • HSphereSurface(radius, center): points on the hypersphere surface. Same convenience method as above is possible.
  • HRectangle(mins, maxs): points in [min, max) for the bounds along each dimension.

The random number generator is always Xoshiro with the given seed.

statespace_sampler(grid::NTuple{N, AbstractRange} [, seed])

If given a grid that is a tuple of AbstractVectors, the minimum and maximum of the vectors are used to make an HRectangle region.
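
A short sketch of the sampler interface described above, drawing points from a rectangle and from a unit sphere. The bounds, seed, and variable names are arbitrary choices for illustration.

```julia
using StateSpaceSets

# Rectangle with x in [-1, 1) and y in [0, 2)
region = HRectangle([-1.0, 0.0], [1.0, 2.0])
sampler, isinside = statespace_sampler(region, 1234)

# Copy each point, since sampler() reuses a shared vector
points = [copy(sampler()) for _ in 1:100]
all(isinside, points)

# Unit sphere centered at the origin of a 3-dimensional state space
sphere_sampler, in_sphere = statespace_sampler(HSphere(1.0, 3))
q = copy(sphere_sampler())
in_sphere(q)
```

Without the copy, every entry of points could end up referring to the same shared vector and therefore hold the last sampled value.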

StateSpaceSets.HSphere - Type
HSphere(r::Real, center::AbstractVector)
HSphere(r::Real, D::Int)

A state space region denoting all points within a hypersphere.

StateSpaceSets.HSphereSurface - Type
HSphereSurface(r::Real, center::AbstractVector)
HSphereSurface(r::Real, D::Int)

A state space region denoting all points on the surface (boundary) of a hypersphere.

StateSpaceSets.HRectangle - Type
HRectangle(mins::AbstractVector, maxs::AbstractVector)

A state space region denoting all points within the hyperrectangle.
