statistics-0.2.1: A library of statistical types, data, and functions

Portability: portable
Stability: experimental
Maintainer: bos@serpentine.com

Statistics.Sample

Description

Commonly used sample statistics, also known as descriptive statistics.

Synopsis

Sample data.

# Statistics of location

Arithmetic mean. This uses Welford's algorithm to provide numerical stability, using a single pass over the sample data.
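As an illustration, the running-mean recurrence usually attributed to Welford can be sketched as follows. This is a minimal sketch over plain lists, not the library's own definition: the name `welfordMean` is hypothetical, and the use of lists instead of the library's sample type is an assumption made to keep the example self-contained.

```haskell
import Data.List (foldl')

-- | Single-pass running mean (hypothetical helper, for illustration only).
-- Returns 0 for an empty list.
welfordMean :: [Double] -> Double
welfordMean = fst . foldl' step (0, 0 :: Int)
  where
    -- m is the mean of the n elements seen so far; each new element
    -- moves the mean by (x - m) / (n + 1).
    step (m, n) x =
      let n' = n + 1
          m' = m + (x - m) / fromIntegral n'
      in m' `seq` n' `seq` (m', n')
```

Updating the mean incrementally avoids accumulating one large sum, which is where the numerical stability comes from.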

Harmonic mean. This algorithm performs a single pass over the sample.
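A single-pass harmonic mean can be sketched by accumulating the element count and the sum of reciprocals together. The name `harmonicMeanSketch` and the list-based input are assumptions for illustration; the library's implementation may differ.

```haskell
import Data.List (foldl')

-- | Harmonic mean: n divided by the sum of reciprocals
-- (hypothetical helper, for illustration only).
-- An empty list yields NaN (0 / 0) in this sketch.
harmonicMeanSketch :: [Double] -> Double
harmonicMeanSketch xs = fromIntegral n / s
  where
    -- One fold tracks both the reciprocal sum and the element count.
    (s, n) = foldl' step (0, 0 :: Int) xs
    step (acc, k) x =
      let acc' = acc + 1 / x
          k'   = k + 1
      in acc' `seq` k' `seq` (acc', k')
```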

Geometric mean of a sample containing no negative values.
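One common way to compute a geometric mean is as the exponential of the mean of the logarithms, which avoids overflow in the running product. The sketch below takes that approach; whether the library does the same is an assumption, and `geometricMeanSketch` is a hypothetical name.

```haskell
import Data.List (foldl')

-- | Geometric mean via exp of the average log
-- (hypothetical helper, for illustration only).
-- Assumes the input contains no negative values, as the description requires.
geometricMeanSketch :: [Double] -> Double
geometricMeanSketch xs = exp (logSum / fromIntegral n)
  where
    -- One fold tracks both the sum of logs and the element count.
    (logSum, n) = foldl' step (0, 0 :: Int) xs
    step (acc, k) x =
      let acc' = acc + log x
          k'   = k + 1
      in acc' `seq` k' `seq` (acc', k')
```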

# Statistics of dispersion

The variance of a sample of fewer than two elements, and hence its standard deviation, is defined to be zero.

## Two-pass functions (numerically robust)

These functions use the compensated summation algorithm of Chan et al. for numerical robustness, but require two passes over the sample data as a result.

Because of the need for two passes, these functions are not subject to stream fusion.

Maximum likelihood estimate of a sample's variance.

Unbiased estimate of a sample's variance.

Standard deviation. This is simply the square root of the maximum likelihood estimate of the variance.
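The two-pass approach can be sketched as follows, using the corrected two-pass formula discussed by Chan et al.: the sum of squared deviations from the mean, minus a compensation term that would be zero in exact arithmetic. All names here are hypothetical, the input is a plain list by assumption, and the library's own code may be organised differently.

```haskell
import Data.List (foldl')

-- | Compensated sum of squared deviations, plus the element count
-- (hypothetical helper, for illustration only).
sumSqDev :: [Double] -> (Double, Int)
sumSqDev xs = (sq - dev * dev / fromIntegral n, n)
  where
    -- Pass 1: sum and count, giving the mean.
    (total, n) = foldl' count (0, 0 :: Int) xs
    m          = total / fromIntegral n
    -- Pass 2: squared deviations (sq) and plain deviations (dev).
    -- dev corrects for rounding error in the computed mean.
    (sq, dev)  = foldl' accum (0, 0) xs
    count (t, k) x =
      let t' = t + x
          k' = k + 1
      in t' `seq` k' `seq` (t', k')
    accum (a, b) x =
      let d  = x - m
          a' = a + d * d
          b' = b + d
      in a' `seq` b' `seq` (a', b')

-- | Maximum likelihood estimate: divide by n. Zero for fewer than two elements.
varianceMLSketch :: [Double] -> Double
varianceMLSketch xs
  | n < 2     = 0
  | otherwise = s / fromIntegral n
  where (s, n) = sumSqDev xs

-- | Unbiased estimate: divide by n - 1. Zero for fewer than two elements.
varianceUnbiasedSketch :: [Double] -> Double
varianceUnbiasedSketch xs
  | n < 2     = 0
  | otherwise = s / fromIntegral (n - 1)
  where (s, n) = sumSqDev xs

-- | Square root of the maximum likelihood estimate, as described above.
stdDevSketch :: [Double] -> Double
stdDevSketch = sqrt . varianceMLSketch
```

The second pass needs the mean from the first, so the input is traversed twice and cannot be consumed as a single stream.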

## Single-pass functions (faster, less safe)

The functions below whose names begin with `fast` perform a single pass over the sample data using Knuth's algorithm. They usually work well, but see the note below for caveats. These functions are subject to array fusion.

Note: in cases where most sample data is close to the sample's mean, Knuth's algorithm gives inaccurate results due to catastrophic cancellation.

Maximum likelihood estimate of a sample's variance.

Unbiased estimate of a sample's variance.

Standard deviation. This is simply the square root of the maximum likelihood estimate of the variance.
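For comparison with the two-pass functions, a single-pass variance can be sketched with the running recurrence given in Knuth's *The Art of Computer Programming*, volume 2. Whether this matches the `fast` functions exactly is an assumption; the names below are hypothetical and the input is a plain list to keep the example self-contained.

```haskell
import Data.List (foldl')

-- | Accumulator: count, running mean, running sum of squared deviations.
data Acc = Acc !Int !Double !Double

-- | One pass over the data, updating all three fields per element
-- (hypothetical helper, for illustration only).
fastAcc :: [Double] -> Acc
fastAcc = foldl' step (Acc 0 0 0)
  where
    step (Acc n m s) x = Acc n' m' s'
      where
        n' = n + 1
        d  = x - m                      -- deviation from the old mean
        m' = m + d / fromIntegral n'    -- updated mean
        s' = s + d * (x - m')           -- updated sum of squared deviations

-- | Maximum likelihood estimate. Zero for fewer than two elements.
fastVarianceSketch :: [Double] -> Double
fastVarianceSketch xs = case fastAcc xs of
  Acc n _ s
    | n < 2     -> 0
    | otherwise -> s / fromIntegral n

-- | Unbiased estimate. Zero for fewer than two elements.
fastVarianceUnbiasedSketch :: [Double] -> Double
fastVarianceUnbiasedSketch xs = case fastAcc xs of
  Acc n _ s
    | n < 2     -> 0
    | otherwise -> s / fromIntegral (n - 1)

-- | Square root of the maximum likelihood estimate.
fastStdDevSketch :: [Double] -> Double
fastStdDevSketch = sqrt . fastVarianceSketch
```

Because each element is consumed exactly once by a strict left fold, a definition of this shape is a candidate for fusion with the producer of the input.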