order-statistics-0.1: L-Estimators for robust statistics

PortabilityHaskell 2011 + TypeFamilies
Stabilityprovisional
MaintainerEdward Kmett <ekmett@gmail.com>
Safe HaskellSafe-Infered

Statistics.Order

Contents

Description

 

Synopsis

L-Estimator

newtype L r Source

L-estimators are linear combinations of order statistics used by robust statistics.

Constructors

L 

Fields

runL :: Int -> IntMap r
 

Instances

Num r => VectorSpace (L r) 
Num r => AdditiveGroup (L r) 

Applying an L-estimator

(@@) :: (Num r, Ord r) => L r -> [r] -> rSource

Calculate the result of applying an L-estimator after sorting list into order statistics

(@!) :: Num r => L r -> Vector r -> rSource

Calculate the result of applying an L-estimator to a *pre-sorted* vector of order statistics

Analyzing an L-estimator

(@#) :: Num r => L r -> Int -> [r]Source

get a vector of the coefficients of an L estimator when applied to an input of a given length

breakdown :: (Num r, Eq r) => L r -> IntSource

calculate the breakdown % of an L-estimator

Robust L-Estimators

trimean :: Fractional r => L rSource

Tukey's trimean

 breakdown trimean = 25

midhinge :: Fractional r => L rSource

 midhinge = trimmed 0.25 midrange
 breakdown midhinge = 25%

iqr :: Fractional r => L rSource

interquartile range

 breakdown iqr = 25%
 iqr = trimmed 0.25 midrange

iqm :: Fractional r => L rSource

interquartile mean

 iqm = trimmed 0.25 mean

lscale :: Fractional r => L rSource

Direct estimator for the second L-moment given a sample

L-Estimator Combinators

trimmed :: Fractional r => Rational -> L r -> L rSource

Calculate a trimmed L-estimator. If the sample size isn't evenly divided, linear interpolation is used as described in http://en.wikipedia.org/wiki/Trimmed_mean#Interpolation

winsorized, winsorised :: Fractional r => Rational -> L r -> L rSource

Calculates an interpolated winsorized L-estimator in a manner analogous to the trimmed estimator. Unlike trimming, winsorizing replaces the extreme values.

jackknifed :: Fractional r => L r -> L rSource

Jackknifes the statistic by removing each sample in turn and recalculating the L-estimator, requires at least 2 samples!

Trivial L-Estimators

mean :: Fractional r => L rSource

The average of all of the order statistics. Not robust.

 breakdown mean = 0%

total :: Num r => L rSource

The sum of all of the order statistics. Not robust.

 breakdown total = 0%

lmin :: Num r => L rSource

The minimum value in the sample

lmax :: Num r => L rSource

The maximum value in the sample

midrange :: Fractional r => L rSource

 midrange = lmax - lmin
 breakdown midrange = 0%

Sample-size-dependent L-Estimators

nthSmallest :: Num r => Int -> L rSource

nthLargest :: Num r => Int -> L rSource

Quantiles

Common quantiles

quantile :: Fractional r => Rational -> L rSource

Compute a quantile with traditional direct averaging

median :: Fractional r => L rSource

The most robust L-estimator possible.

 breakdown median = 50

t1, t2 :: Fractional r => L rSource

terciles 1 and 2

 breakdown t1 = breakdown t2 = 33%

q1, q3, q2 :: Fractional r => L rSource

quantiles, with breakdown points 25%, 50%, and 25% respectively

qu1, qu4, qu3, qu2 :: Fractional r => L rSource

quintiles 1 through 4

percentile :: Fractional r => Rational -> L rSource

 breakdown (percentile n) = min n (100 - n)

Harrell-Davis Quantile Estimator

hdquantile :: Fractional r => Rational -> L rSource

The Harrell-Davis quantile estimate. Uses multiple order statistics to approximate the quantile to reduce variance.

Compute a quantile using a specified quantile estimation strategy

quantileBy :: Num r => Estimator r -> Rational -> L rSource

Compute a quantile using the given estimation strategy to interpolate when an exact quantile isn't available

Sample Quantile Estimators

data Estimate r Source

Sample quantile estimators

Constructors

Estimate !Rational (IntMap r) 

r1 :: Num r => Estimator rSource

Inverse of the empirical distribution function

r2 :: Fractional r => Estimator rSource

.. with averaging at discontinuities

r3 :: Num r => Estimator rSource

The observation numbered closest to Np. NB: does not yield a proper median

r4 :: Fractional r => Estimator rSource

Linear interpolation of the empirical distribution function. NB: does not yield a proper median.

r5 :: Fractional r => Estimator rSource

.. with knots midway through the steps as used in hydrology. This is the simplest continuous estimator that yields a correct median

r6 :: Fractional r => Estimator rSource

Linear interpolation of the expectations of the order statistics for the uniform distribution on [0,1]

r7 :: Fractional r => Estimator rSource

Linear interpolation of the modes for the order statistics for the uniform distribution on [0,1]

r8 :: Fractional r => Estimator rSource

Linear interpolation of the approximate medans for order statistics.

r9 :: Fractional r => Estimator rSource

The resulting quantile estimates are approximately unbiased for the expected order statistics if x is normally distributed.

r10 :: Fractional r => Estimator rSource

When rounding h, this yields the order statistic with the least expected square deviation relative to p.