statistics-0.5.0.0: A library of statistical types, data, and functions

Portability: portable
Stability: experimental
Maintainer: bos@serpentine.com

Statistics.KernelDensity

Description

Kernel density estimation code, providing non-parametric ways to estimate the probability density function of a sample.

Simple entry points

epanechnikovPDF

Arguments

:: Int                       -- Number of points at which to estimate
-> Sample
-> (Points, Vector Double)

Simple Epanechnikov kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.

gaussianPDF

Arguments

:: Int                       -- Number of points at which to estimate
-> Sample
-> (Points, Vector Double)

Simple Gaussian kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
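
As an illustration of the two simple entry points, the following sketch calls each with the signature shown above. It assumes that Sample and the returned Vector Double are unboxed vectors from the vector package (Data.Vector.Unboxed), which this page does not state; the sample values are made up.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.KernelDensity (epanechnikovPDF, gaussianPDF)

main :: IO ()
main = do
  -- Assumption: Sample is an unboxed Vector Double in this release.
  let sample = U.fromList [1.2, 1.9, 2.4, 2.5, 3.1, 4.8, 5.0]
      -- Each call returns (points, estimates); only the estimates are printed.
      (_gaussPoints, gaussEstimates) = gaussianPDF 16 sample
      (_epanPoints, epanEstimates)   = epanechnikovPDF 16 sample
  print gaussEstimates
  print epanEstimates
```

Both vectors should have 16 entries, one per estimation point.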

Building blocks

Choosing points from a sample

newtype Points

Points from the range of a Sample.

Constructors

Points 

choosePoints

Arguments

:: Int       -- Number of points to select, n
-> Double    -- Sample bandwidth, h
-> Sample    -- Input data
-> Points

Choose a uniform range of points at which to estimate a sample's probability density function.

If you are using a Gaussian kernel, multiply the sample's bandwidth by 3 before passing it to this function.

If this function is passed an empty vector, it returns values of positive and negative infinity.
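
A minimal sketch of this kind of point selection, written against plain lists rather than the library's types. It assumes the sample range is padded by h on each side, which fits the advice to pass 3h for a Gaussian kernel but is not confirmed by this page. The name uniformPoints is hypothetical, and the sketch expects a non-empty sample and n of at least 2; it does not try to reproduce the empty-input behaviour.

```haskell
-- Hypothetical helper, not the library's code: n evenly spaced points
-- covering the interval [minimum xs - h, maximum xs + h].
uniformPoints :: Int -> Double -> [Double] -> [Double]
uniformPoints n h xs = [lo + fromIntegral i * step | i <- [0 .. n - 1]]
  where
    lo   = minimum xs - h                     -- left edge, padded by h
    hi   = maximum xs + h                     -- right edge, padded by h
    step = (hi - lo) / fromIntegral (n - 1)   -- spacing between neighbours
```

For example, uniformPoints 5 1 [2, 4, 6] covers the interval from 1 to 7 in steps of 1.5.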

Bandwidth estimation

type Bandwidth = Double

The width of the convolution kernel used.

bandwidth :: (Double -> Bandwidth) -> Sample -> Bandwidth

Compute the optimal bandwidth from the observed data for the given kernel.

epanechnikovBW :: Double -> Bandwidth

Bandwidth estimator for an Epanechnikov kernel.

gaussianBW :: Double -> Bandwidth

Bandwidth estimator for a Gaussian kernel.
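
The split between a per-kernel estimator of type Double -> Bandwidth and the bandwidth function suggests a rule-of-thumb scheme: a factor that depends only on the sample size, scaled by the spread of the data. The sketch below illustrates that idea with the well-known normal-reference factor for a Gaussian kernel, (4 / (3n))^(1/5). Whether the library uses this constant, or the sample standard deviation as its measure of spread, is an assumption; all names here are hypothetical.

```haskell
-- Sample standard deviation (n - 1 denominator); needs at least two values.
sampleStdDev :: [Double] -> Double
sampleStdDev xs = sqrt (sum [sq (x - mean) | x <- xs] / (n - 1))
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n
    sq d = d * d

-- Normal-reference factor for a Gaussian kernel, as a function of sample size.
gaussianFactor :: Double -> Double
gaussianFactor n = (4 / (3 * n)) ** 0.2

-- Hypothetical analogue of 'bandwidth': spread of the data times the factor.
ruleOfThumbBandwidth :: (Double -> Double) -> [Double] -> Double
ruleOfThumbBandwidth factor xs =
  sampleStdDev xs * factor (fromIntegral (length xs))
```

Calling ruleOfThumbBandwidth gaussianFactor on a list of observations mirrors the shape of bandwidth gaussianBW on a Sample.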

Kernels

type Kernel = Double -> Double -> Double -> Double -> Double

The convolution kernel. Its parameters are as follows:

  • Scaling factor, 1/nh
  • Bandwidth, h
  • A point at which to sample the input, p
  • One sample value, v
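
To show how the four parameters might combine, the sketch below lifts a one-argument kernel profile K(u) into the four-argument form by evaluating it at u = (v - p) / h and multiplying by the scaling factor. The two profiles are the textbook Gaussian and Epanechnikov kernels; the library's own kernels may be normalised differently, and every name here is hypothetical.

```haskell
-- Same shape as the Kernel type: scaling factor, bandwidth, point, sample value.
type KernelSketch = Double -> Double -> Double -> Double -> Double

-- Lift a standard kernel profile K(u) into the four-argument form.
fromProfile :: (Double -> Double) -> KernelSketch
fromProfile k f h p v = f * k ((v - p) / h)

-- Textbook Gaussian profile: the standard normal density.
gaussianProfile :: Double -> Double
gaussianProfile u = exp (-0.5 * u * u) / sqrt (2 * pi)

-- Textbook Epanechnikov profile: a parabola on [-1, 1], zero elsewhere.
epanechnikovProfile :: Double -> Double
epanechnikovProfile u
  | abs u <= 1 = 0.75 * (1 - u * u)
  | otherwise  = 0
```

Summing fromProfile k (1 / (n * h)) h p v over all n sample values v gives the usual kernel density estimate at p.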

epanechnikovKernel :: Kernel

Epanechnikov kernel for probability density function estimation.

gaussianKernel :: Kernel

Gaussian kernel for probability density function estimation.

Low-level estimation

estimatePDF

Arguments

:: Kernel         -- Kernel function
-> Bandwidth      -- Bandwidth, h
-> Sample         -- Sample data
-> Points         -- Points at which to estimate
-> Vector Double

Kernel density estimator, providing a non-parametric way of estimating the PDF of a random variable.
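
A self-contained sketch of the estimator's core loop, using plain lists in place of Sample and Points. For each point p it sums the kernel over every sample value, with the scaling factor 1/nh passed as the kernel's first argument. This illustrates the standard technique under those assumptions and is not the library's implementation; estimateSketch and gaussianSketch are hypothetical names.

```haskell
-- Same shape as the Kernel type: scaling factor, bandwidth, point, sample value.
type KernelSketch = Double -> Double -> Double -> Double -> Double

-- Textbook Gaussian kernel in the four-argument form.
gaussianSketch :: KernelSketch
gaussianSketch f h p v = f * exp (-0.5 * u * u) / sqrt (2 * pi)
  where
    u = (v - p) / h

-- Density estimate at each point: the sum of kernel contributions
-- from every sample value, scaled by 1 / (n * h).
estimateSketch :: KernelSketch -> Double -> [Double] -> [Double] -> [Double]
estimateSketch kernel h sample points =
  [sum [kernel f h p v | v <- sample] | p <- points]
  where
    f = 1 / (fromIntegral (length sample) * h)
```

With a single observation at 0 and h = 1, the estimate at 0 is the peak of the standard normal density, roughly 0.3989.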

simplePDF

Arguments

:: (Double -> Double)         -- Bandwidth function
-> Kernel                     -- Kernel function
-> Double                     -- Bandwidth scaling factor (3 for a Gaussian kernel, 1 for all others)
-> Int                        -- Number of points at which to estimate
-> Sample                     -- Sample data
-> (Points, Vector Double)

A helper for creating a simple kernel density estimation function with automatically chosen bandwidth and estimation points.
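
The sketch below shows how the building blocks on this page might be combined by hand, next to the equivalent-looking simplePDF call. It uses only the signatures documented above, and assumes that Sample and Vector Double are unboxed vectors from Data.Vector.Unboxed. That the two calls produce identical results is a plausible reading of the descriptions, not something this page confirms; manualGaussianPDF is a hypothetical name.

```haskell
import qualified Data.Vector.Unboxed as U
import Statistics.KernelDensity

-- Assumption: Sample and the returned Vector Double are unboxed vectors.
manualGaussianPDF :: Int -> U.Vector Double -> (Points, U.Vector Double)
manualGaussianPDF n sample = (points, estimatePDF gaussianKernel h sample points)
  where
    h      = bandwidth gaussianBW sample    -- bandwidth chosen from the data
    points = choosePoints n (3 * h) sample  -- bandwidth scaled by 3 for a Gaussian kernel

main :: IO ()
main = do
  let sample = U.fromList [1.2, 1.9, 2.4, 2.5, 3.1, 4.8, 5.0]
  print (snd (manualGaussianPDF 16 sample))
  print (snd (simplePDF gaussianBW gaussianKernel 3 16 sample))
```

Building the pipeline by hand is mainly useful when you want to fix the bandwidth yourself or reuse the same Points across several samples.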