Overview of the functionality
The itsample function consumes the whole stream at once and returns the collected sample:
using StreamSampling
st = 1:100;
itsample(st, 5)
5-element Vector{Int64}:
35
3
98
20
17
In some cases, you need to control which updates the ReservoirSampler receives. In that case, call the fit! function to update the reservoir one element at a time:
st = 1:100;
rs = ReservoirSampler{Int}(5);
for x in st
    fit!(rs, x)
end
value(rs)
5-element Vector{Int64}:
58
68
84
35
56
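To make the update step concrete, here is a minimal sketch of a reservoir update in plain Julia, assuming the classic Algorithm R. The names SimpleReservoir and update! are hypothetical and are not part of StreamSampling, whose own implementation may use a different and faster method.

```julia
using Random

# Hypothetical type for illustration only (not the StreamSampling API).
mutable struct SimpleReservoir{T}
    k::Int             # target sample size
    seen::Int          # number of elements observed so far
    items::Vector{T}   # current sample, holds at most k elements
end

# Convenience constructor: start with an empty reservoir of capacity k.
SimpleReservoir{T}(k::Int) where {T} = SimpleReservoir{T}(k, 0, T[])

# Algorithm R: keep the first k elements, then let each later element
# replace a uniformly chosen slot with probability k / seen.
function update!(r::SimpleReservoir, x, rng::AbstractRNG = Random.default_rng())
    r.seen += 1
    if length(r.items) < r.k
        push!(r.items, x)
    else
        j = rand(rng, 1:r.seen)
        if j <= r.k
            r.items[j] = x
        end
    end
    return r
end

rng = MersenneTwister(42)
r = SimpleReservoir{Int}(5)
for x in 1:100
    update!(r, x, rng)
end
r.items   # 5 elements drawn without replacement from 1:100
```

Because the caller drives the loop, it can skip elements, pause, or stop early, which is the kind of control that fit! provides.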
If the total number of elements in the stream is known beforehand and the sampling is unweighted, it is also possible to iterate over a StreamSampler, like so:
st = 1:100;
ss = StreamSampler{Int}(st, 5, 100);
r = Int[];
for x in ss
    push!(r, x)
end
r
5-element Vector{Int64}:
22
28
32
43
96
The advantage of StreamSampler iterators over ReservoirSampler is that they require O(1) memory if the sample is not collected, while reservoir techniques require O(k) memory, where k is the number of elements in the sample.
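The constant-memory claim can be illustrated with a minimal sketch of sequential sampling when the stream length is known, assuming selection sampling (Knuth's Algorithm S). The function sequential_sample is hypothetical and is not part of StreamSampling, which may rely on a more efficient method. Only two counters are kept, and each selected element is handed to a callback instead of being stored.

```julia
using Random

# Hypothetical helper for illustration only (not the StreamSampling API).
# Calls f on each of k elements chosen uniformly without replacement
# from a stream that is assumed to contain exactly n elements.
function sequential_sample(f, stream, k::Int, n::Int,
                           rng::AbstractRNG = Random.default_rng())
    remaining = n   # elements not yet inspected
    needed = k      # elements still to be selected
    for x in stream
        needed == 0 && break
        # Select x with probability needed / remaining.
        if rand(rng) * remaining < needed
            f(x)
            needed -= 1
        end
        remaining -= 1
    end
    return nothing
end

# Collecting the output costs O(k) memory; the sampler itself does not.
out = Int[]
sequential_sample(1:100, 5, 100, MersenneTwister(1)) do x
    push!(out, x)
end
out   # 5 distinct elements of 1:100, in stream order
```

Elements come out in stream order under this scheme, which matches the increasing output shown for the StreamSampler example.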