streamly-core-0.1.0: Streaming, parsers, arrays and more
Copyright: (c) 2020 Composewell Technologies
License: Apache-2.0
Maintainer: streamly@composewell.com
Stability: experimental
Portability: GHC
Safe Haskell: Safe-Inferred
Language: Haskell2010

Streamly.Internal.Data.Fold.Window

Description

Simple incremental statistical measures over a stream of data. All operations use numerically stable floating point arithmetic.

Measurements can be performed over the entire input stream or on a sliding window of fixed or variable size. Where possible, measures are computed online without buffering the input stream.

Currently there is no overflow detection.

For more advanced statistical measures see the streamly-statistics package.

Incremental Folds

Folds of type Fold m (a, Maybe a) b are incremental sliding window folds. An input of the form (x, Nothing) indicates that the element x is being inserted into the window without ejecting an old value, increasing the window size by 1. An input of the form (x, Just y) indicates that the first element x is being inserted into the window and the second element y is being removed from it, so the window size remains the same. The window size can only increase, never decrease.

You can compute the statistics over the entire stream using sliding window folds by keeping the second element of the input tuple as Nothing.
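
The input protocol described above can be modelled with plain lists. The following is a minimal sketch that uses only base and is not the library's implementation; windowInputs, stepSum, windowSum and cumulativeSum are hypothetical names introduced for illustration.

```haskell
import Data.List (foldl')

-- | Pair each element with the element it ejects from a window of size n.
-- The first n elements eject nothing, so they are paired with Nothing.
-- (Hypothetical helper, not part of the library.)
windowInputs :: Int -> [a] -> [(a, Maybe a)]
windowInputs n xs = zip xs (replicate n Nothing ++ map Just xs)

-- | One step of an incremental sum: add the incoming element and
-- subtract the ejected one, if there is one.
stepSum :: Num a => a -> (a, Maybe a) -> a
stepSum acc (new, old) = acc + new - maybe 0 id old

-- | Sum of the last n elements once the whole list has been consumed.
windowSum :: Num a => Int -> [a] -> a
windowSum n = foldl' stepSum 0 . windowInputs n

-- | Keeping the second component as Nothing never ejects anything,
-- so the window grows to cover the entire input.
cumulativeSum :: Num a => [a] -> a
cumulativeSum = foldl' stepSum 0 . map (\x -> (x, Nothing))
```

In this model, windowSum 2 [1, 2, 3, 4] evaluates to 7 (the sum of the last two elements), while cumulativeSum [1, 2, 3, 4] evaluates to 10.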

lmap :: (c -> a) -> Fold m (a, Maybe a) b -> Fold m (c, Maybe c) b

Map a function on the incoming as well as outgoing element of a rolling window fold.

>>> lmap f = Fold.lmap (bimap f (f <$>))

cumulative :: Fold m (a, Maybe a) b -> Fold m a b

Convert an incremental fold to a cumulative fold using the entire input stream as a single window.

>>> cumulative f = Fold.lmap (\x -> (x, Nothing)) f
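
To see what the definitions of lmap and cumulative do to the input, here is a small sketch built on a toy fold type. ToyFold and every function below are hypothetical stand-ins written against base only; they mirror the shape of the definitions above but are not the library's Fold type or its implementation.

```haskell
import Data.Bifunctor (bimap)
import Data.List (foldl')

-- | A toy pure left fold with an explicit state type: step function,
-- initial state, and extraction function. This only stands in for the
-- library's Fold type; it is not the real definition.
data ToyFold s a b = ToyFold (s -> a -> s) s (s -> b)

-- | Run a toy fold over a list.
runToy :: ToyFold s a b -> [a] -> b
runToy (ToyFold step z done) = done . foldl' step z

-- | Transform every input before it reaches the fold.
lmapToy :: (c -> a) -> ToyFold s a b -> ToyFold s c b
lmapToy f (ToyFold step z done) = ToyFold (\s c -> step s (f c)) z done

-- | Window version: apply f to the incoming element and, if present,
-- to the outgoing element.
lmapWindow :: (c -> a) -> ToyFold s (a, Maybe a) b -> ToyFold s (c, Maybe c) b
lmapWindow f = lmapToy (bimap f (fmap f))

-- | Never eject anything, so the whole input is a single window.
cumulativeToy :: ToyFold s (a, Maybe a) b -> ToyFold s a b
cumulativeToy = lmapToy (\x -> (x, Nothing))

-- | Incremental sum used to exercise the two combinators above.
sumWindow :: Num a => ToyFold a (a, Maybe a) a
sumWindow = ToyFold (\s (new, old) -> s + new - maybe 0 id old) 0 id
```

The point of mapping both components is that an element must be transformed the same way when it leaves the window as when it entered; otherwise the incremental state would drift.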

rollingMap :: Monad m => (Maybe a -> a -> Maybe b) -> Fold m (a, Maybe a) (Maybe b)

Apply a pure function on the latest and the oldest element of the window.

>>> rollingMap f = FoldW.rollingMapM (\x y -> return $ f x y)

rollingMapM :: Monad m => (Maybe a -> a -> m (Maybe b)) -> Fold m (a, Maybe a) (Maybe b)

Apply an effectful function on the latest and the oldest element of the window.
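
As a rough illustration of the pure variant, the sketch below models rollingMap over a list of window inputs. It assumes, based on the signature, that the Maybe argument receives the outgoing (oldest) element and the plain argument the incoming (latest) one, and that the fold's result is the function's output for the most recent input. rollingMapModel and delta are hypothetical names, and the code uses base only.

```haskell
import Data.List (foldl')

-- | List model of rollingMap: the result is whatever the function
-- returns for the last (incoming, outgoing) pair, or Nothing on empty
-- input. Assumes the Maybe argument is the outgoing (oldest) element.
rollingMapModel :: (Maybe a -> a -> Maybe b) -> [(a, Maybe a)] -> Maybe b
rollingMapModel f = foldl' (\_ (new, old) -> f old new) Nothing

-- | Example function: change between the incoming element and the
-- element leaving the window; Nothing until an element is ejected.
delta :: Num a => Maybe a -> a -> Maybe a
delta old new = fmap (new -) old
```

With the inputs [(1, Nothing), (3, Nothing), (6, Just 1), (10, Just 3)], rollingMapModel delta yields Just 7 in this model, the difference between the last inserted element and the element it ejected.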

Sums

length :: (Monad m, Num b) => Fold m (a, Maybe a) b

The number of elements in the rolling window.

This is the \(0\)th power sum.

>>> length = powerSum 0

sum :: forall m a. (Monad m, Num a) => Fold m (a, Maybe a) a

Sum of all the elements in a rolling window:

\(S = \sum_{i=1}^n x_{i}\)

This is the first power sum.

>>> sum = powerSum 1

Uses Kahan-Babuska-Neumaier style summation for numerical stability of floating point arithmetic.

Space: \(\mathcal{O}(1)\)

Time: \(\mathcal{O}(n)\)
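
For readers unfamiliar with compensated summation, the following is a textbook sketch of Neumaier's variant over a plain list of Double values. It illustrates the general technique only; it is not the library's code, it ignores the sliding window, and neumaierSum is a hypothetical name.

```haskell
import Data.List (foldl')

-- | Neumaier's variant of Kahan summation. The state is the running sum
-- together with a compensation term that collects the low-order bits
-- lost in each addition.
neumaierSum :: [Double] -> Double
neumaierSum = finish . foldl' step (0, 0)
  where
    step (s, c) x =
      let t = s + x
          -- Recover the rounding error from whichever operand is smaller.
          c' | abs s >= abs x = c + ((s - t) + x)
             | otherwise      = c + ((x - t) + s)
      in t `seq` c' `seq` (t, c')
    -- Apply the accumulated correction once at the end.
    finish (s, c) = s + c
```

For the input [1, 1e100, 1, -1e100], neumaierSum returns 2.0, whereas the naive left-to-right sum returns 0.0 because both 1s are absorbed by 1e100.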

sumInt :: forall m a. (Monad m, Integral a) => Fold m (a, Maybe a) a

The sum of all the elements in a rolling window. The input elements are required to be integral numbers.

This was written in the hope that it would be a tiny bit faster than sum for Integral values. However, it turns out that sum is 2% faster than this even for integral values!

Internal

powerSum :: (Monad m, Num a) => Int -> Fold m (a, Maybe a) a

Sum of the \(k\)th power of all the elements in a rolling window:

\(S_k = \sum_{i=1}^n x_{i}^k\)

>>> powerSum k = lmap (^ k) sum

Space: \(\mathcal{O}(1)\)

Time: \(\mathcal{O}(n)\)
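
The relationship between power sums, length and sum can be checked with a small list model. This is a sketch using base only, with the hypothetical name powerSumModel; it is not how the library computes the result, and it uses plain subtraction rather than compensated summation.

```haskell
import Data.List (foldl')

-- | Rolling k-th power sum over the last n elements of a list.
-- Each element is paired with the element it ejects from the window,
-- and both are raised to the k-th power before updating the total.
powerSumModel :: Num a => Int -> Int -> [a] -> a
powerSumModel k n xs = foldl' step 0 inputs
  where
    inputs = zip xs (replicate n Nothing ++ map Just xs)
    step acc (new, old) = acc + new ^ k - maybe 0 (^ k) old
```

With a window of size 2 over [5, 6, 7], the model gives 2 for k = 0 (the window length), 13 for k = 1 (the window sum) and 85 for k = 2 (the sum of squares of 6 and 7).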

powerSumFrac :: (Monad m, Floating a) => a -> Fold m (a, Maybe a) a

Like powerSum but powers can be negative or fractional. This is slower than powerSum for positive integral powers.

>>> powerSumFrac p = lmap (** p) sum

Location

minimum :: (MonadIO m, Storable a, Ord a) => Int -> Fold m a (Maybe a)

Find the minimum element in a rolling window.

This implementation traverses the entire window buffer to compute the minimum whenever it is demanded. It performs better than the deque-based implementation in the streamly-statistics package when the window size is small (< 30).

If you want to compute the minimum of the entire stream, Fold.minimum is much faster.

Time: \(\mathcal{O}(n*w)\) where \(w\) is the window size.
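
The buffer-scanning approach and its cost can be sketched with plain lists. In this model a list stands in for the window buffer; pushWindow, rollingMinimum and rollingMinima are hypothetical names, the code uses base only, and it is not the library's implementation.

```haskell
import Data.List (foldl')

-- | Append an element to the buffer, keeping at most the last w elements.
-- A list stands in for the window buffer here.
pushWindow :: Int -> [a] -> a -> [a]
pushWindow w buf x = drop (length buf + 1 - w) (buf ++ [x])

-- | Minimum of the final window, found by scanning the whole buffer.
-- Returns Nothing when the window is empty.
rollingMinimum :: Ord a => Int -> [a] -> Maybe a
rollingMinimum w xs =
  case foldl' (pushWindow w) [] xs of
    []  -> Nothing
    buf -> Just (minimum buf)

-- | Minimum after every input, assuming w >= 1. Each output rescans up
-- to w elements, which is where the O(n*w) total cost comes from.
rollingMinima :: Ord a => Int -> [a] -> [a]
rollingMinima w = map minimum . drop 1 . scanl (pushWindow w) []
```

For a window of size 3 over [5, 1, 4, 2, 8], rollingMinima produces [5, 1, 1, 1, 2] and rollingMinimum produces Just 2.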

maximum :: (MonadIO m, Storable a, Ord a) => Int -> Fold m a (Maybe a)

The maximum element in a rolling window.

See the performance related comments in minimum.

If you want to compute the maximum of the entire stream, Fold.maximum would be much faster.

Time: \(\mathcal{O}(n*w)\) where \(w\) is the window size.

range :: (MonadIO m, Storable a, Ord a) => Int -> Fold m a (Maybe (a, a))

Determine the maximum and minimum in a rolling window.

If you want to compute the range of the entire stream, Fold.teeWith (,) Fold.maximum Fold.minimum would be much faster.

Space: \(\mathcal{O}(w)\) where \(w\) is the window size.

Time: \(\mathcal{O}(n*w)\) where \(w\) is the window size.

mean :: forall m a. (Monad m, Fractional a) => Fold m (a, Maybe a) a

Arithmetic mean of elements in a sliding window:

\(\mu = \frac{\sum_{i=1}^n x_{i}}{n}\)

This is also known as the Simple Moving Average (SMA) when used over a sliding window and the Cumulative Moving Average (CMA) when used on the entire stream.

>>> mean = Fold.teeWith (/) sum length

Space: \(\mathcal{O}(1)\)

Time: \(\mathcal{O}(n)\)
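
The definition of mean as the window sum divided by the window length can be traced with a list model. This is a sketch using base only; rollingMean is a hypothetical name, and the code uses plain addition rather than the compensated summation used by sum.

```haskell
import Data.List (foldl')

-- | Mean of the last n elements of a list, tracking the window sum and
-- the window length side by side. The result is NaN for empty input,
-- since both the sum and the length are then zero.
rollingMean :: Fractional a => Int -> [a] -> a
rollingMean n xs = s / c
  where
    (s, c) = foldl' step (0, 0) (zip xs (replicate n Nothing ++ map Just xs))
    -- Growing window: the sum and the length both increase.
    step (accS, accC) (new, Nothing) = (accS + new, accC + 1)
    -- Full window: the length is unchanged, the oldest element leaves.
    step (accS, accC) (new, Just old) = (accS + new - old, accC)
```

With a window of size 2 over [1, 2, 3, 4] the model gives 3.5 (the SMA of the last two elements); with a window larger than the input it gives 2.5, which is the CMA of the whole list.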