Safe Haskell: None
A histogram with an exponentially decaying reservoir produces quantiles which are representative of (roughly) the last five minutes of data. It does so by using a forward-decaying priority reservoir with an exponential weighting towards newer data. Unlike the uniform reservoir, an exponentially decaying reservoir represents recent data, allowing you to know very quickly if the distribution of the data has changed. Timers use histograms with exponentially decaying reservoirs by default.
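The weighting scheme behind this can be sketched in a few lines. This is an illustration of the forward-decay idea only; `weight`, `priority`, `alpha`, and the landmark `l` are hypothetical names, not part of this module's API. Each sample's weight grows exponentially with its timestamp relative to a fixed landmark, and its priority is that weight divided by a uniform random draw in (0, 1]; the reservoir keeps the highest-priority samples, so newer samples probabilistically displace older ones.

```haskell
-- Illustrative sketch of forward-decay priorities (not this module's API).
-- alpha is the decay factor, l the landmark time, t the sample time.
weight :: Double -> Double -> Double -> Double
weight alpha l t = exp (alpha * (t - l))

-- Priority of a sample: its weight scaled by a uniform draw u in (0, 1].
-- The lowest-priority samples are evicted first.
priority :: Double -> Double -> Double -> Double -> Double
priority alpha l t u = weight alpha l t / u

main :: IO ()
main = do
  -- With alpha = 0.015 and landmark 0, a sample at t = 300s (five minutes)
  -- weighs exp(4.5), roughly 90 times a sample taken at the landmark.
  print (weight 0.015 0 300 / weight 0.015 0 0)
```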
- data ExponentiallyDecayingReservoir
- standardReservoir :: NominalDiffTime -> Seed -> Reservoir
- reservoir :: Double -> Int -> NominalDiffTime -> Seed -> Reservoir
- clear :: NominalDiffTime -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir
- size :: ExponentiallyDecayingReservoir -> Int
- snapshot :: ExponentiallyDecayingReservoir -> Snapshot
- rescale :: Word64 -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir
- update :: Double -> NominalDiffTime -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir
Documentation
data ExponentiallyDecayingReservoir
A forward-decaying priority reservoir.
standardReservoir :: NominalDiffTime -> Seed -> Reservoir
An exponentially decaying reservoir with an alpha value of 0.015 and a 1028 sample cap.
The 1028-sample cap offers a 99.9% confidence level with a 5% margin of error (assuming a normal distribution), while the alpha factor of 0.015 heavily biases the reservoir toward the past five minutes of measurements.
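To see why alpha = 0.015 corresponds to roughly a five-minute bias, note that at query time a sample of age `a` seconds carries a relative weight of exp(−alpha · a). This is a back-of-the-envelope check, not part of the module:

```haskell
-- Relative weight at query time of a sample that is `age` seconds old,
-- for a given per-second decay factor alpha.
relativeWeight :: Double -> Double -> Double
relativeWeight alpha age = exp (negate alpha * age)

main :: IO ()
main = do
  -- At alpha = 0.015, a five-minute-old sample retains about 1.1%
  -- of a fresh sample's weight...
  print (relativeWeight 0.015 300)
  -- ...and a ten-minute-old sample about 0.01%.
  print (relativeWeight 0.015 600)
```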
reservoir
  :: Double          -- ^ alpha value
  -> Int             -- ^ max reservoir size
  -> NominalDiffTime -- ^ creation time for the reservoir
  -> Seed
  -> Reservoir
Create a reservoir with a custom alpha factor and reservoir size.
clear :: NominalDiffTime -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir
Reset the reservoir.
size :: ExponentiallyDecayingReservoir -> Int
Get the current size of the reservoir.
snapshot :: ExponentiallyDecayingReservoir -> Snapshot
Get a snapshot of the current reservoir.
rescale :: Word64 -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir
"A common feature of the above techniques—indeed, the key technique that allows us to track the decayed weights efficiently—is that they maintain counts and other quantities based on g(t_i − L), and only scale by g(t − L) at query time. But while g(t_i − L)/g(t − L) is guaranteed to lie between zero and one, the intermediate values of g(t_i − L) could become very large. For polynomial functions, these values should not grow too large, and should be effectively represented in practice by floating point values without loss of precision. For exponential functions, these values could grow quite large as new values of (t_i − L) become large, and potentially exceed the capacity of common floating point types. However, since the values stored by the algorithms are linear combinations of g values (scaled sums), they can be rescaled relative to a new landmark. That is, by the analysis of exponential decay in Section III-A, the choice of L does not affect the final result. We can therefore multiply each value based on L by a factor of exp(−α(L′ − L)), and obtain the correct value as if we had instead computed relative to a new landmark L′ (and then use this new L′ at query time). This can be done with a linear pass over whatever data structure is being used."
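The rescaling pass described in the quotation can be sketched directly: every stored weight computed against landmark l is multiplied by exp(−alpha · (l′ − l)) to re-express it relative to the new landmark l′. The function below is an illustration of that linear pass over a priority-keyed map, not this module's internal representation:

```haskell
import qualified Data.Map.Strict as Map

-- Re-express weights computed against landmark l relative to a new
-- landmark l'. Multiplying by a positive constant is monotonic, so the
-- relative ordering of samples is preserved.
rescaleWeights :: Double -> Double -> Double -> Map.Map Double a -> Map.Map Double a
rescaleWeights alpha l l' = Map.mapKeysMonotonic (* exp (negate alpha * (l' - l)))

main :: IO ()
main = do
  let old = Map.fromList [ (exp (0.015 * 100), "t=100")
                         , (exp (0.015 * 200), "t=200") ]
      new = rescaleWeights 0.015 0 200 old
  -- After moving the landmark from 0 to 200, the t=200 sample's weight
  -- is ~1 and the t=100 sample's is exp(-1.5), about 0.22.
  print (Map.toList new)
```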
update
  :: Double          -- ^ new sample value
  -> NominalDiffTime -- ^ time of update
  -> ExponentiallyDecayingReservoir
  -> ExponentiallyDecayingReservoir
Insert a new sample into the reservoir. This may cause old sample values to be evicted based upon the probabilistic weighting given to the key at insertion time.
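The eviction behaviour can be sketched with a priority-keyed map capped at a fixed size: a new sample enters under its computed priority, and if the reservoir is then over capacity, the sample with the smallest priority is dropped. The names and types here are illustrative only; the real reservoir also tracks the landmark, seed, and rescaling state.

```haskell
import qualified Data.Map.Strict as Map

-- Insert a (priority, value) pair, evicting the minimum-priority sample
-- when the cap is exceeded. Illustrative; not this module's implementation.
insertCapped :: Int -> Double -> Double -> Map.Map Double Double -> Map.Map Double Double
insertCapped cap p v m
  | Map.size m' > cap = Map.deleteMin m'
  | otherwise         = m'
  where m' = Map.insert p v m

main :: IO ()
main = do
  let m = foldr (\(p, v) acc -> insertCapped 2 p v acc) Map.empty
            [(1.0, 10), (3.0, 30), (2.0, 20)]
  -- Only the two highest-priority samples survive with a cap of 2.
  print (Map.elems m)  -- prints [20.0,30.0]
```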