(c) 2020 Composewell Technologies
Apache-2.0
streamly@composewell.com
experimental
GHC
Safe-Inferred

streamly-statistics

Test if the given integer value is a power of 2.

Create a power of 2.
Argument must be less than 64 assuming 64-bit Int size.

Create a bit mask with lower n bits 0 and the rest as 1.
Argument must be less than 64 assuming 64-bit Int size.

Compute the base 2 logarithm of the given value.
Assumes the Int size to be 64-bit.

Compute the fast Fourier transform of an array of Complex values. The array length must be a power of 2.

The minimum element in a rolling window.
For smaller window sizes (< 30) Streamly.Data.Fold.Window.minimum performs better. If you want to compute the minimum of the entire stream, Streamly.Data.Fold.minimum from the streamly package would be much faster.
Time: \mathcal{O}(n*w) where w is the window size.

The maximum element in a rolling window.
For smaller window sizes (< 30) Streamly.Data.Fold.Window.maximum performs better. If you want to compute the maximum of the entire stream, Streamly.Data.Fold.maximum from the streamly package would be much faster.
Time: \mathcal{O}(n*w) where w is the window size.

Arithmetic mean of elements in a sliding window:
\mu = \frac{\sum_{i=1}^n x_{i}}{n}
This is also known as the Simple Moving Average (SMA) when used in the sliding window and the Cumulative Moving Average (CMA) when used on the entire stream.
Mean is the same as the first raw moment.
\mu = \mu'_1
mean = rawMoment 1
mean = powerMean 1
mean = Fold.teeWith (/) sum length
Space: \mathcal{O}(1)
Time: \mathcal{O}(n)

Recompute mean from old mean when an item is removed from the sample.

Recompute mean from old mean when an item is added to the sample.

Recompute mean from old mean when an item in the sample is replaced.

Same as   but uses Welford's algorithm to compute the mean incrementally.
It maintains a running mean instead of a running sum and adjusts the mean based on a new value. This is slower than   because of using the division operation on each step and it is numerically unstable (as of now). The advantage over   could be no overflow if the numbers are large, because we do not maintain a sum, but that is a highly unlikely corner case.
Internal

Raw moment is the moment about 0. The kth raw moment is defined as:
\mu'_k = \frac{\sum_{i=1}^n x_{i}^k}{n}
rawMoment k = Fold.teeWith (/) (powerSum k) length
See https://en.wikipedia.org/wiki/Moment_(mathematics).
Space: \mathcal{O}(1)
Time: \mathcal{O}(n)

The kth power mean of numbers x_1, x_2, \ldots, x_n is:
M_k = \left( \frac{1}{n} \sum_{i=1}^n x_i^k \right)^{\frac{1}{k}}
powerMean(k) = (rawMoment(k))^\frac{1}{k}
powerMean k = (** (1 / fromIntegral k)) <$> rawMoment k
All other means can be expressed in terms of power mean. It is also known as the generalized mean.
See https://en.wikipedia.org/wiki/Generalized_mean.

Like  but powers can be negative or fractional. This is slower than  for positive integral powers.
rawMomentFrac p = Fold.teeWith (/) (powerSumFrac p) length

Like  but powers can be negative or fractional.
This is slower than  for positive integral powers.
powerMeanFrac k = (** (1 / k)) <$> rawMomentFrac k

The harmonic mean of the positive numbers x_1, x_2, \ldots, x_n is defined as:
HM = \frac{n}{\frac1{x_1} + \frac1{x_2} + \cdots + \frac1{x_n}}
HM = \left(\frac{\sum\limits_{i=1}^n x_i^{-1}}{n}\right)^{-1}
harmonicMean = Fold.teeWith (/) length (lmap recip sum)
harmonicMean = powerMeanFrac (-1)
See https://en.wikipedia.org/wiki/Harmonic_mean.

Geometric mean, defined as:
GM = \sqrt[n]{x_1 x_2 \cdots x_n}
GM = \left(\prod_{i=1}^n x_i\right)^\frac{1}{n}
or, equivalently, as the arithmetic mean in log space:
GM = e^{\frac{\sum_{i=1}^{n}\ln x_i}{n}}
geometricMean = exp <$> lmap log mean
See https://en.wikipedia.org/wiki/Geometric_mean.

The quadratic mean or root mean square (rms) of the numbers x_1, x_2, \ldots, x_n is defined as:
RMS = \sqrt{ \frac{1}{n} \left( x_1^2 + x_2^2 + \cdots + x_n^2 \right) }.
quadraticMean = powerMean 2
See https://en.wikipedia.org/wiki/Root_mean_square.

ewmaStep smoothing-factor old-value new-value

ewma smoothingFactor.
ewma of an empty stream is 0.
Exponential weighted moving average, s_n, of n values, x_1,\ldots,x_n, is defined recursively as:
\begin{align} s_0& = x_0\\ s_n & = \alpha x_{n} + (1-\alpha)s_{n-1},\quad n>0 \end{align}
If we expand the recursive term it becomes an exponential series:
s_n = \alpha \left[x_n + (1-\alpha)x_{n-1} + (1-\alpha)^2 x_{n-2} + \cdots + (1-\alpha)^{n-1} x_1 \right] + (1-\alpha)^n x_0
where \alpha, the smoothing factor, is in the range 0 < \alpha < 1. The larger the value of \alpha, the more weight is given to newer values.
As a special case if it is 0 then the weighted sum would always be the same as the oldest value, if it is 1 then the sum would always be the same as the newest value.
See https://en.wikipedia.org/wiki/Moving_average
See https://en.wikipedia.org/wiki/Exponential_smoothing

ewma n k is like  but uses the mean of the first n values and then uses that as the initial value for the ewma of the rest of the values.
This can be used to reduce the effect of volatility of the initial value when k is too small.

ewma n k is like  but uses 1 as the initial smoothing factor and then exponentially smooths it to k using n as the smoothing factor.
This is significantly faster than .

The difference between the maximum and minimum elements of a rolling window.
range = Fold.teeWith (-) maximum minimum
If you want to compute the range of the entire stream, Fold.teeWith (-) Fold.maximum Fold.minimum from the streamly package would be much faster.
Space: \mathcal{O}(n) where n is the window size.
Time: \mathcal{O}(n*w) where w is the window size.

md n computes the mean absolute deviation (or mean deviation) in a sliding window of last n elements in the stream.
The mean absolute deviation of the numbers x_1, x_2, \ldots, x_n is:
MD = \frac{1}{n}\sum_{i=1}^n |x_i-\mu|
Note: It is expensive to compute MD in a sliding window. We need to maintain a ring buffer of last n elements and maintain a running mean, when the result is extracted we need to compute the difference of all elements from the mean and get the average. Using standard deviation may be computationally cheaper.
See https://en.wikipedia.org/wiki/Average_absolute_deviation.
Pre-release

The variance \sigma^2 of a population of n equally likely values is defined as the average of the squares of deviations from the mean \mu.
In other words, second moment about the mean:
\sigma^2 = \frac{1}{n}\sum_{i=1}^n {(x_{i}-\mu)}^2
\sigma^2 = rawMoment(2) - \mu^2
\mu_2 = -(\mu'_1)^2 + \mu'_2
Note that the variance would be biased if applied to estimate the population variance from a sample of the population. See .
See https://en.wikipedia.org/wiki/Variance.
Space: \mathcal{O}(1)
Time: \mathcal{O}(n)

Standard deviation \sigma is the square root of .
This is the population standard deviation or uncorrected sample standard deviation.
stdDev = sqrt <$> variance
See https://en.wikipedia.org/wiki/Standard_deviation.
Space: \mathcal{O}(1)
Time: \mathcal{O}(n)

Skewness \gamma is the standardized third central moment defined as:
\tilde{\mu}_3 = \frac{\mu_3}{\sigma^3}
The third central moment can be computed in terms of raw moments:
\mu_3 = 2(\mu'_1)^3 - 3\mu'_1\mu'_2 + \mu'_3
Substituting \mu'_1 = \mu, and \mu'_2 = \mu^2 + \sigma^2:
\mu_3 = -\mu^3 - 3\mu\sigma^2 + \mu'_3
Skewness is a measure of symmetry of the probability distribution. It is 0 for a symmetric distribution, negative for a distribution that is skewed towards left, positive for a distribution skewed towards right.
For a normal like distribution the median can be found around \mu - \frac{\gamma\sigma}{6} and the mode can be found around \mu - \frac{\gamma \sigma}{2}.
See https://en.wikipedia.org/wiki/Skewness.

Kurtosis \kappa is the standardized fourth central moment, defined as:
\tilde{\mu}_4 = \frac{\mu_4}{\sigma^4}
The fourth central moment can be computed in terms of raw moments:
\mu_4 = -3(\mu'_1)^4 + 6(\mu'_1)^2\mu'_2 - 4\mu'_1\mu'_3 + \mu'_4
Substituting \mu'_1 = \mu, and \mu'_2 = \mu^2 + \sigma^2:
\mu_4 = 3\mu^4 + 6\mu^2\sigma^2 - 4\mu\mu'_3 + \mu'_4
It is always non-negative.
It is 0 for a point distribution, low for light tailed (platykurtic) distributions and high for heavy tailed (leptokurtic) distributions.
\kappa \geq \gamma^2 + 1
For a normal distribution \mu_4 = 3\sigma^4, so \kappa = 3.
See https://en.wikipedia.org/wiki/Kurtosis.

Unbiased sample variance i.e. the variance of a sample corrected to better estimate the variance of the population, defined as:
s^2 = \frac{1}{n - 1}\sum_{i=1}^n {(x_{i}-\mu)}^2
s^2 = \frac{n}{n - 1} \times \sigma^2.
See https://en.wikipedia.org/wiki/Bessel%27s_correction.

Sample standard deviation:
s = \sqrt{sampleVariance}
sampleStdDev = sqrt <$> sampleVariance
See https://en.wikipedia.org/wiki/Unbiased_estimation_of_standard_deviation.

Standard error of the sample mean (SEM), defined as:
SEM = \frac{sampleStdDev}{\sqrt{n}}
See https://en.wikipedia.org/wiki/Standard_error.
Space: \mathcal{O}(1)
Time: \mathcal{O}(n)

Given an array of n items, compute mean of (n - 1) items at a time, producing a stream of all possible mean values omitting a different item every time.

Given an array of n items, compute variance of (n - 1) items at a time, producing a stream of all possible variance values omitting a different item every time.

Standard deviation computed from jackKnifeVariance.

Randomly select elements from an array, with replacement, producing a stream of the same size as the original array.

Resample an array multiple times and run the supplied fold on each resampled stream, producing a stream of fold results.
The fold is usually an estimator fold.

Count the frequency of elements in a sliding window.
input = Stream.fromList [1,1,3,4,4::Int]
f = Ring.slidingWindow 4 Statistics.frequency
Stream.fold f input
fromList [(1,1),(3,1),(4,2)]

Determine the frequency of each element in the stream.

Find out the most frequently occurring element in the stream and its frequency.

binOffsetSize offset binSize input. Given an integral input value, return its bin index provided that each bin contains binSize items and the bins are aligned such that the 0 index bin starts at offset from 0. If offset = 0 then the bin with index 0 would have values from 0 to binSize - 1.
This API does not put a bound on the number of bins, therefore, the number of bins could be potentially large depending on the range of values.

binFromSizeN low binSize nbins input. Classify input into bins specified by a low limit, binSize and nbins. Inputs below the lower limit are classified into  and inputs above the highest bin are classified into  .  inputs are classified into bins starting from bin index 0.

binFromToN low high nbins input. Like binFromSizeN except that a range of lower and higher limit is specified. binSize is computed using the range and nbins. nbins is rounded to the range 0 < nbins < (high - low + 1). high >= low must hold.

Classify an input value to bins using the bin boundaries specified in an array.
Unimplemented

Given a bin classifier function and a stream of values, generate a histogram map from indices of bins to the number of items in the bin.
Stream.fold (histogram (binOffsetSize 0 3)) $ Stream.fromList [1..15]
fromList [(0,2),(1,3),(2,3),(3,3),(4,3),(5,1)]

Number of resamples to compute.
Original sample.
Estimator fold
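The binning and histogram entries above can be illustrated with a small list-based sketch. It does not use the library: binIndex and histogramOf are hypothetical stand-ins, written on the assumption that the bin index is (input - offset) `div` binSize and that the histogram is a map from bin index to item count. Under that assumption it reproduces the result shown for histogram (binOffsetSize 0 3) over [1..15].

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical stand-in for binOffsetSize (not the library function):
-- bin index of an integral input, where bin 0 starts at the given offset
-- and every bin holds binSize consecutive values.
binIndex :: Int -> Int -> Int -> Int
binIndex offset binSize x = (x - offset) `div` binSize

-- Hypothetical stand-in for histogram (not the library fold):
-- count how many items fall into each bin index.
histogramOf :: (a -> Int) -> [a] -> Map.Map Int Int
histogramOf classify = Map.fromListWith (+) . map (\x -> (classify x, 1))

main :: IO ()
main = print (Map.toList (histogramOf (binIndex 0 3) [1 .. 15]))
-- prints [(0,2),(1,3),(2,3),(3,3),(4,3),(5,1)]
```

Bin 0 holds only two items because the input starts at 1 while the bin covers 0 to 2. Because `div` rounds towards negative infinity, inputs below the offset land in negative bin indices in this sketch.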
streamly-statistics-0.1.0-J0rarq8i9HcIpdW5XbNCDT Streamly.Statistics
streamly-core-0.1.0-44m3GA0JDl468cZG2M3Pu6 Streamly.Internal.Data.Fold.Window
powerSum length sum sumInt cumulative lmap
HistBin BelowRange InRange AboveRange
fft minimum maximum mean welfordMean rawMoment rawMomentFrac powerMean powerMeanFrac harmonicMean geometricMean quadraticMean ewma ewmaAfterMean ewmaRampUpSmoothing range md variance stdDev skewness kurtosis sampleVariance sampleStdDev stdErrMean jackKnifeMean jackKnifeVariance jackKnifeStdDev resample foldResamples frequency frequency' mode binOffsetSize binFromSizeN binFromToN binBoundaries histogram
$fOrdHistBin $fEqHistBin $fShowHistBin
isPower2 _power2 maskLowerN logBase2
base Data.Complex Complex
_meanSubtract meanAdd meanReplace ewmaStep