RecurrenceMicrostatesAnalysis.jl for devs
All pull requests that introduce new functionality must be thoroughly tested and documented. Tests are required only for the methods you extend. Always remember to add docstrings to your implementations, as well as tests to validate them.
We recommend reading the Good Scientific Code Workshop
Backend of RecurrenceMicrostatesAnalysis.jl
RecurrenceMicrostatesAnalysis.jl has multiple backends, depending on the usage context. Each backend is implemented based on an internal struct RMACore or on the input data type.
There are three backend implementations:
For sequential data using the CPU: it internally uses a
CPUCore <: RMACore, which defines that the process must run on the CPU, and the input data is aStateSpaceSet(or aVector{<:Real}).For spatial data using the CPU: it internally uses a
CPUCore <: RMACore, which defines that the process must run on the CPU, and the input data is anArray.For sequential data using the GPU: it internally uses a
GPUCore <: RMACore, which defines that the process must run on the GPU, and the input data is anAbstractGPUVector{<:SVector}.
Note that there is a significant difference between the input data types, in such a way that RMACore is just an auxiliary struct used to differentiate the hardware backend internally.
The package backends are located at src/core/cpu_core.jl (CPU) and src/core/gpu/gpu_core.jl (GPU).
You do not need to fully understand how the backend operates to define something new in the package. However, it is important to understand how the backend requires internal functions based on RMACore or the input data type. For example, when implementing a RecurrenceExpression, the CPU structure uses a recurrence function, while the GPU structure uses a gpu_recurrence function.
Adding a new Recurrence Expression
Steps
- Define the mathematical expression of your recurrence expression. It must return a binary value:
UInt(0)for non-recurrence andUInt(1)for recurrence. - Define a new type
YourType <: RecurrenceExpression. Constant parameters (e.g., recurrence threshold and metric) should be fields of this type. - Implement the appropriate
recurrencedispatch:- Sequential data:
recurrence(expr::YourType, x::StateSpaceSet, y::StateSpaceSet, i::Int, j::Int) - Spatial data:
recurrence(expr::YourType, x::AbstractArray{<:Real}, y::AbstractArray{<:Real}, i::NTuple{N,Int}, j::NTuple{M,Int}) - GPU:
gpu_recurrence(expr::YourType, x, y, i, j, n)
- Sequential data:
- Add a docstring describing the mathematical definition and relevant references.
- Add the recurrence expression to
docs/src/api.md. - Add the expression to the
RecurrenceExpressiondocstring. - Add tests to
test/distributions.jlandtest/recurrences.jl.
Example
Let's define a "recurrence" expression as:
\[r_{(i,j)} = \Theta(\|\vec{x}_i - \vec{x}_j\| - \varepsilon).\]
First, we define our struct:
using RecurrenceMicrostatesAnalysis
using Distances: Euclidean, Metric, evaluate
struct MyRecurrenceExpr{T <: Real, M <: Metric} <: RecurrenceExpression{T, M}
ε::T
metric::M
end
MyRecurrenceExpr(ε) = MyRecurrenceExpr(ε, Euclidean())Main.MyRecurrenceExprNext, we define the recurrence function:
@inline function RecurrenceMicrostatesAnalysis.recurrence(
expr::MyRecurrenceExpr,
x::StateSpaceSet,
y::StateSpaceSet,
i::Int,
j::Int,
)
distance = @inbounds evaluate(expr.metric, x.data[i], y.data[j])
return UInt8(distance ≥ expr.ε)
endAnd that's it:
rmspace = RecurrenceMicrostates(MyRecurrenceExpr(0.27), 3)RecurrenceMicrostates, with 3 fields:
shape = RecurrenceMicrostatesAnalysis.Rect2Microstate{3, 3, 2}()
expr = Main.MyRecurrenceExpr{Float64, Euclidean}(0.27, Euclidean(0.0))
sampling = SRandom{Float64}(0.05)
X = randn(1000)
probabilities(rmspace, X) Probabilities{Float64,1} over 512 outcomes
1 0.0001405594265175398
2 6.023975422180277e-5
3 2.007991807393426e-5
4 2.007991807393426e-5
5 2.007991807393426e-5
6 0.0
7 2.007991807393426e-5
8 0.0003815184434047509
9 0.0
10 2.007991807393426e-5
⋮
504 0.036003293106564124
505 0.0025702295134635853
506 0.009297002068231561
507 0.00773076845846469
508 0.037971125077809684
509 0.008975723379048613
510 0.037288407863295917
511 0.03624425212345134
512 0.2602959779924098Adding a new Sampling Mode
Steps
- Define how the sampling mode operates: which microstates are sampled, from which regions, and in what quantity. The
SamplingSpacemust be taken into account when designing the sampling logic. - Define a new struct that is a subtype of
SamplingMode. The struct may be empty (e.g.,Full) or contain parameters such as a sampling ratio (e.g.,SRandom). - Implement the dispatch
get_num_samples(mode::YourType, ::SamplingSpace), which determines the number of samples to be drawn given the sampling mode and the sampling space. - Implement the dispatch
get_sample(::RMACore, ::YourType, space::SamplingSpace, rng, m), which returns the starting pair $(i, j)$ to construct the microstate. Here,RMACoredefines whether it is running on the CPU or GPU,rngis a random number generator, andmis a counter of microstates. - Add a docstring to your sampling mode describing its behavior and initialization.
- Add your sampling mode to
docs/src/api.md. - Add the expression to the
SamplingModedocstring. - Add tests to
test/distributions.jlandtest/sampling.jl.
Adding a new Microstate Shape
Steps
- Define your microstate design. It essentially determines the microstate structure and reading order. For example, square microstates are read row-wise, while triangular microstates may be read column-wise. Each position in the microstate structure must be associated with a power of two in order to convert the binary microstate into a decimal index.
- Define a new struct that is a subtype of
MicrostateShape. - Implement the dispatch
get_histogram_size(::MyShape), which returns the histogram length. - Implement the dispatch
get_power_vector(::RMACore, ::MyShape), which returns the power vector used to read the microstate as an integer. - Implement the dispatch
get_offsets(::RMACore, ::MyShape), which returns which positions are accessed from $(i, j)$ to construct the microstate. - Define how your shape reacts to a
SamplingSpaceby implementingSamplingSpace(::MyType, x, y). - Add a docstring to your microstate shape describing its behavior and initialization.
- Add your microstate shape to
docs/src/api.md. - Add the expression to the
MicrostateShapedocstring. - Add tests to
test/distributions.jlandtest/shapes.jl.
Example
As an example, let's try to construct the struct to obtain the microstate:
\[\begin{matrix} r_{(i,j)} & & r_{(i, j+2)} \\ & r_{(i+1, j+1)} & \\ r_{(i+2,j)} & & r_{(i+2, j+2)} \end{matrix}\]
For simplicity, we are not going to generalize it for arbitrary sizes 😅. First, let's define our struct:
struct MyMicrostateShape <: MicrostateShape endNext, we need to define the histogram size, which will be returned by get_histogram_size. It is $2^\sigma$, where $\sigma$ is the number of recurrences contained within the microstate shape. For a square microstate we have $\sigma = N^2$, and for a triangular microstate $\sigma = N(N - 1)\div 2$. In our example, it is simple since our shape is fixed: $\sigma = 5$, so:
RecurrenceMicrostatesAnalysis.get_histogram_size(::MyMicrostateShape) = 2^5The next step is to define our power vector. It determines how we read our microstate as an integer. Each position should be associated with a power of 2:
\[\begin{matrix} 2^0r_{(i,j)} & & 2^1r_{(i, j+2)} \\ & 2^2r_{(i+1, j+1)} & \\ 2^3r_{(i+2,j)} & & 2^4r_{(i+2, j+2)} \end{matrix}\]
Then, we write:
RecurrenceMicrostatesAnalysis.get_power_vector(::RecurrenceMicrostatesAnalysis.CPUCore, ::MyMicrostateShape) = SVector{5, Int}([1, 2, 4, 8, 16])Finally, we need to define the set of offsets used to construct the microstate from the trajectory. Note that each offset must have the same index as the corresponding element in the power vector.
function RecurrenceMicrostatesAnalysis.get_offsets(::RecurrenceMicrostatesAnalysis.CPUCore, ::MyMicrostateShape)
elems = [
SVector{2, Int}([0, 0]),
SVector{2, Int}([0, 2]),
SVector{2, Int}([1, 1]),
SVector{2, Int}([2, 0]),
SVector{2, Int}([2, 2])
]
return SVector{5, SVector{2, Int}}(elems)
endNow, we just need to define how our microstate will behave with respect to a sampling space. It is not necessary to define it for all available sampling spaces, but you need to do so for at least one of them.
RecurrenceMicrostatesAnalysis.SamplingSpace(
::MyMicrostateShape,
x::StateSpaceSet,
y::StateSpaceSet
) = RecurrenceMicrostatesAnalysis.SSRect2(length(x) - 2, length(y) - 2)And that's it, we can now use our new microstate shape 🙂 (and why not combine it with the previous recurrence expression?!)
rmspace = RecurrenceMicrostates(MyRecurrenceExpr(0.27), MyMicrostateShape())RecurrenceMicrostates, with 3 fields:
shape = Main.MyMicrostateShape()
expr = Main.MyRecurrenceExpr{Float64, Euclidean}(0.27, Euclidean(0.0))
sampling = SRandom{Float64}(0.05)
probabilities(rmspace, X) Probabilities{Float64,1} over 32 outcomes
1 0.00046183811570048793
2 0.00016063934459147408
3 0.0002811188530350796
4 0.0028915082026465333
5 0.0023694303327242423
6 0.0011847151663621212
7 0.0010039959036967129
8 0.015762735688038394
9 0.00036143852533081667
10 0.0027308688580550593
⋮
24 0.07509889359651413
25 0.002871428284572599
26 0.012570028714282845
27 0.012650348386578582
28 0.07891407803056164
29 0.016184413967591012
30 0.07542017228569707
31 0.0753197726953274
32 0.45213951527077767Adding a new quantity estimator
Since RecurrenceMicrostatesAnalysis.jl uses the same structure as ComplexityMeasures.jl to estimate or measure complexity values (e.g., determinism, disorder, etc.), the method to implement new features is very similar.
To add new quantity estimators, refer to the ComplexityMeasures.jl Dev Docs.
Tests for new quantifiers implemented on RecurrenceMicrostatesAnalysis.jl need to be add to test/rqa.jl.
Adding a new GPU metric
Due to the incompatibility of Distances.jl with GPUs, it may be necessary to redefine some metrics to use them with the RecurrenceMicrostatesAnalysis.jl GPU backend.
Steps
- Define a new type that is a subtype of
GPUMetric. - Implement the dispatch
gpu_evaluate(::YourMetric, x, y, i, j, n). Here,iandjindicate which positions of theAbstractGPUVectorare accessed (iforx, andjfory), andnis the number of dimensions of the system. - Add a docstring to your metric describing it.
- Add your metric to
docs/src/api.md. - Add the expression to the
GPUMetricdocstring. - Add tests to
test/utils.jl.