metrics-0.4.0.1: High-performance application metric tracking

Safe Haskell: None
Language: Haskell2010

Data.Metrics.Reservoir.ExponentiallyDecaying

Description

A histogram with an exponentially decaying reservoir produces quantiles which are representative of (roughly) the last five minutes of data. It does so by using a forward-decaying priority reservoir with an exponential weighting towards newer data. Unlike the uniform reservoir, an exponentially decaying reservoir represents recent data, allowing you to know very quickly if the distribution of the data has changed. Timers use histograms with exponentially decaying reservoirs by default.

Synopsis

standardReservoir :: NominalDiffTime -> Seed -> Reservoir
reservoir :: Double -> Int -> NominalDiffTime -> Seed -> Reservoir
size :: ExponentiallyDecayingReservoir -> Int
snapshot :: ExponentiallyDecayingReservoir -> Snapshot
rescale :: Word64 -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir
update :: Double -> NominalDiffTime -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir

Documentation

standardReservoir :: NominalDiffTime -> Seed -> Reservoir Source #

An exponentially decaying reservoir with an alpha value of 0.015 and a 1028 sample cap.

The 1028-sample cap offers a 99.9% confidence level with a 5% margin of error (assuming a normal distribution), and the alpha factor of 0.015 heavily biases the reservoir towards the past 5 minutes of measurements.
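A minimal construction sketch. It assumes the Seed comes from mwc-random (System.Random.MWC) and the creation time from getPOSIXTime, whose POSIXTime result is a NominalDiffTime; those imports are assumptions about the surrounding setup, not part of this module:

import Data.Metrics.Reservoir.ExponentiallyDecaying (standardReservoir)
import Data.Time.Clock.POSIX (getPOSIXTime)
import System.Random.MWC (createSystemRandom, save)

main :: IO ()
main = do
  seed <- save =<< createSystemRandom -- PRNG seed drives the probabilistic sampling
  now  <- getPOSIXTime                -- creation time, used as the decay landmark
  let r = standardReservoir now seed  -- alpha = 0.015, 1028-sample cap
  r `seq` putStrLn "reservoir created"

As the signature above indicates, the result is the generic Reservoir type from Data.Metrics.Reservoir rather than the raw ExponentiallyDecayingReservoir state used by the functions further down.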

reservoir Source #

Arguments

:: Double                 alpha value
-> Int                    max reservoir size
-> NominalDiffTime        creation time for the reservoir
-> Seed
-> Reservoir

Create a reservoir with a custom alpha factor and reservoir size.
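For comparison, a sketch of a custom construction; the 0.1 alpha and 512-sample cap are purely illustrative values, and the import location of the Reservoir type is an assumption:

import Data.Metrics.Reservoir (Reservoir)
import Data.Metrics.Reservoir.ExponentiallyDecaying (reservoir)
import Data.Time.Clock (NominalDiffTime)
import System.Random.MWC (Seed)

-- Decays faster than standardReservoir (larger alpha) and keeps fewer samples.
shortWindowReservoir :: NominalDiffTime -> Seed -> Reservoir
shortWindowReservoir = reservoir 0.1 512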

size :: ExponentiallyDecayingReservoir -> Int Source #

Get the current size of the reservoir.

snapshot :: ExponentiallyDecayingReservoir -> Snapshot Source #

Get a snapshot of the current reservoir.
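A small sketch combining the two raw accessors. It assumes you already hold an unwrapped ExponentiallyDecayingReservoir value, and that the Snapshot type lives in Data.Metrics.Snapshot:

import Data.Metrics.Reservoir.ExponentiallyDecaying
import Data.Metrics.Snapshot (Snapshot)

-- Pair the current population count with a point-in-time snapshot of it.
inspect :: ExponentiallyDecayingReservoir -> (Int, Snapshot)
inspect r = (size r, snapshot r)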

rescale :: Word64 -> ExponentiallyDecayingReservoir -> ExponentiallyDecayingReservoir Source #

"A common feature of the above techniques—indeed, the key technique that allows us to track the decayed weights efficiently – is that they maintain counts and other quantities based on g(ti − L), and only scale by g(t − L) at query time. But while g(ti −L)/g(t−L) is guaranteed to lie between zero and one, the intermediate values of g(ti − L) could become very large. For polynomial functions, these values should not grow too large, and should be effectively represented in practice by floating point values without loss of precision. For exponential functions, these values could grow quite large as new values of (ti − L) become large, and potentially exceed the capacity of common floating point types. However, since the values stored by the algorithms are linear combinations of g values (scaled sums), they can be rescaled relative to a new landmark. That is, by the analysis of exponential decay in Section III-A, the choice of L does not affect the final result. We can therefore multiply each value based on L by a factor of exp(−α(L′ − L)), and obtain the correct value as if we had instead computed relative to a new landmark L′ (and then use this new L′ at query time). This can be done with a linear pass over whatever data structure is being used."

update Source #

Arguments

:: Double                           new sample value
-> NominalDiffTime                  time of update
-> ExponentiallyDecayingReservoir
-> ExponentiallyDecayingReservoir

Insert a new sample into the reservoir. This may cause old sample values to be evicted based upon the probabilistic weighting given to the key at insertion time.
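A hedged sketch of feeding a series of timestamped samples into the raw reservoir state; the (value, time) pairs and the fold are illustrative, not a prescribed usage pattern:

import Data.Metrics.Reservoir.ExponentiallyDecaying
import Data.Time.Clock (NominalDiffTime)

-- Fold a batch of timestamped samples into the reservoir, oldest first.
recordAll :: [(Double, NominalDiffTime)]
          -> ExponentiallyDecayingReservoir
          -> ExponentiallyDecayingReservoir
recordAll samples r0 = foldl (\r (v, t) -> update v t r) r0 samples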