| Field | Value |
|---|---|
| Portability | portable |
| Stability | experimental |
| Maintainer | bos@serpentine.com |
| Safe Haskell | None |
Statistics.Sample.KernelDensity.Simple
Description
Deprecated: Use Statistics.Sample.KernelDensity instead.
Kernel density estimation code, providing non-parametric ways to estimate the probability density function of a sample.
The functions in this module are relatively fast, but they generally give worse results than the KDE function in Statistics.Sample.KernelDensity, because of the oversmoothing documented for bandwidth below.
- epanechnikovPDF :: Vector v Double => Int -> v Double -> (Points, Vector Double)
- gaussianPDF :: Vector v Double => Int -> v Double -> (Points, Vector Double)
- newtype Points = Points {}
- choosePoints :: Vector v Double => Int -> Double -> v Double -> Points
- type Bandwidth = Double
- bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth
- epanechnikovBW :: Double -> Bandwidth
- gaussianBW :: Double -> Bandwidth
- type Kernel = Double -> Double -> Double -> Double -> Double
- epanechnikovKernel :: Kernel
- gaussianKernel :: Kernel
- estimatePDF :: Vector v Double => Kernel -> Bandwidth -> v Double -> Points -> Vector Double
- simplePDF :: Vector v Double => (Double -> Double) -> Kernel -> Double -> Int -> v Double -> (Points, Vector Double)
Simple entry points
epanechnikovPDF
Arguments
| :: Vector v Double | |
| => Int | Number of points at which to estimate | 
| -> v Double | Data sample | 
| -> (Points, Vector Double) | 
Simple Epanechnikov kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
gaussianPDF
Arguments
| :: Vector v Double | |
| => Int | Number of points at which to estimate | 
| -> v Double | Data sample | 
| -> (Points, Vector Double) | 
Simple Gaussian kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
Building blocks
Choosing points from a sample
newtype Points
Points from the range of a Sample.
Constructors
| Points | |
| Fields | |
choosePoints
Arguments
| :: Vector v Double | |
| => Int | Number of points to select, n | 
| -> Double | Sample bandwidth, h | 
| -> v Double | Input data | 
| -> Points | 
Choose a uniform range of points at which to estimate a sample's probability density function.
If you are using a Gaussian kernel, multiply the sample's bandwidth by 3 before passing it to this function.
If this function is passed an empty vector, it returns values of positive and negative infinity.
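As a rough illustration of the uniform selection described above, the following sketch uses plain lists rather than vectors. The name uniformPoints is hypothetical, and widening the sample range by h on each side is an assumption about how the bandwidth argument is used, not something this page states. It also assumes a non-empty sample and n of at least 2.

```haskell
-- Hypothetical list-based stand-in for choosePoints (not the library's code).
-- Assumption: the covered range is [min - h, max + h] of the sample.
uniformPoints :: Int -> Double -> [Double] -> [Double]
uniformPoints n h sample = [lo + step * fromIntegral i | i <- [0 .. n - 1]]
  where
    lo   = minimum sample - h                    -- left edge, padded by h
    hi   = maximum sample + h                    -- right edge, padded by h
    step = (hi - lo) / fromIntegral (n - 1)      -- spacing between neighbours
```

For example, uniformPoints 5 0.5 [1, 2, 3] spans 0.5 to 3.5 in steps of 0.75.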
Bandwidth estimation
bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth
Compute the optimal bandwidth from the observed data for the given kernel.
This function uses an estimate based on the standard deviation of a sample (due to Deheuvels), which performs reasonably well for unimodal distributions but leads to oversmoothing for more complex ones.
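A minimal sketch of a standard-deviation-based estimate of this kind is shown below. All names are hypothetical. The use of the unbiased (n - 1) standard deviation and of the normal-reference factor (4 / (3n))^(1/5) for a Gaussian kernel are assumptions made for illustration. They may not match what bandwidth and gaussianBW actually compute.

```haskell
-- Unbiased sample standard deviation (assumes at least two values).
stdDevSketch :: [Double] -> Double
stdDevSketch xs = sqrt (sum [(x - mean) * (x - mean) | x <- xs] / (n - 1))
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n

-- Normal-reference factor for a Gaussian kernel, as a function of sample size.
-- Assumption: this plays the role of the (Double -> Bandwidth) argument.
gaussianFactor :: Double -> Double
gaussianFactor n = (4 / (3 * n)) ** 0.2

-- Hypothetical analogue of bandwidth: the sample's spread scaled by a
-- kernel-specific factor that depends only on the sample size.
bandwidthSketch :: (Double -> Double) -> [Double] -> Double
bandwidthSketch factor xs = stdDevSketch xs * factor (fromIntegral (length xs))
```

Because the estimate depends on the data only through its standard deviation, a bimodal sample with widely separated modes yields a large h, which is one way to see where the oversmoothing comes from.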
epanechnikovBW :: Double -> Bandwidth
Bandwidth estimator for an Epanechnikov kernel.
gaussianBW :: Double -> Bandwidth
Bandwidth estimator for a Gaussian kernel.
Kernels
type Kernel = Double -> Double -> Double -> Double -> Double
The convolution kernel. Its parameters are as follows:
- Scaling factor, 1/nh
- Bandwidth, h
- A point at which to sample the input, p
- One sample value, v
epanechnikovKernel :: Kernel
Epanechnikov kernel for probability density function estimation.
gaussianKernel :: Kernel
Gaussian kernel for probability density function estimation.
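To make the four-argument Kernel shape concrete, here is a sketch of the two kernels in their textbook forms. The names are hypothetical, and the normalising constants are the standard ones (1/sqrt(2π) for the Gaussian, 3/4 for the Epanechnikov). The library's own definitions may scale things differently.

```haskell
-- Same argument order as Kernel: scaling factor f = 1/(n*h), bandwidth h,
-- evaluation point p, and one sample value v.
type KernelSketch = Double -> Double -> Double -> Double -> Double

-- Textbook Gaussian kernel: f * exp(-u^2 / 2) / sqrt(2*pi), with u = (v - p) / h.
gaussianSketch :: KernelSketch
gaussianSketch f h p v = f * exp (-0.5 * u * u) / sqrt (2 * pi)
  where
    u = (v - p) / h

-- Textbook Epanechnikov kernel: f * 3/4 * (1 - u^2) inside |u| <= 1, zero outside.
epanechnikovSketch :: KernelSketch
epanechnikovSketch f h p v
  | abs u <= 1 = f * 0.75 * (1 - u * u)
  | otherwise  = 0
  where
    u = (v - p) / h
```

The Epanechnikov kernel has compact support, so a sample value further than h from p contributes nothing, whereas the Gaussian kernel contributes a small positive amount at any distance.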
Low-level estimation
estimatePDF
Arguments
| :: Vector v Double | |
| => Kernel | Kernel function | 
| -> Bandwidth | Bandwidth, h | 
| -> v Double | Sample data | 
| -> Points | Points at which to estimate | 
| -> Vector Double | 
Kernel density estimator, providing a non-parametric way of estimating the PDF of a random variable.
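The estimate at each point is the sum of the kernel's contributions from every sample value. The sketch below shows that structure with lists. estimateSketch and boxKernel are hypothetical names, and the box (uniform) kernel is swapped in only to keep the arithmetic easy to follow. It is not one of the kernels this module provides.

```haskell
-- Hypothetical list-based analogue of estimatePDF (not the library's code).
-- For each point p, sum the kernel over all sample values v,
-- passing the shared scaling factor f = 1 / (n * h).
estimateSketch
  :: (Double -> Double -> Double -> Double -> Double)  -- kernel
  -> Double                                            -- bandwidth h
  -> [Double]                                          -- sample data
  -> [Double]                                          -- points at which to estimate
  -> [Double]
estimateSketch kernel h sample points =
  [sum [kernel f h p v | v <- sample] | p <- points]
  where
    f = 1 / (h * fromIntegral (length sample))

-- Box kernel for illustration: weight 1/2 inside |u| <= 1, zero outside.
boxKernel :: Double -> Double -> Double -> Double -> Double
boxKernel f h p v
  | abs ((v - p) / h) <= 1 = f * 0.5
  | otherwise              = 0
```

With sample [0, 1] and h = 1, the estimate at 0.5 is 0.5 (both values fall within one bandwidth) and the estimate at 3 is 0.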
simplePDF
Arguments
| :: Vector v Double | |
| => (Double -> Double) | Bandwidth function | 
| -> Kernel | Kernel function | 
| -> Double | Bandwidth scaling factor (3 for a Gaussian kernel, 1 for all others) | 
| -> Int | Number of points at which to estimate | 
| -> v Double | Sample data | 
| -> (Points, Vector Double) | 
A helper for creating a simple kernel density estimation function with automatically chosen bandwidth and estimation points.
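One plausible way such a helper could combine the pieces is sketched below with plain lists. Every name is hypothetical. The assumptions are that the bandwidth is the unbiased standard deviation times the bandwidth function applied to the sample size, and that the scaling factor only widens the range of estimation points (as the note for choosePoints suggests) without changing the bandwidth passed to the kernel.

```haskell
-- Hypothetical list-based analogue of simplePDF (not the library's code).
simpleSketch
  :: (Double -> Double)                                -- bandwidth function of sample size
  -> (Double -> Double -> Double -> Double -> Double)  -- kernel
  -> Double                                            -- bandwidth scaling factor
  -> Int                                               -- number of points (at least 2)
  -> [Double]                                          -- sample data (at least 2 values)
  -> ([Double], [Double])
simpleSketch factor kernel k n sample = (points, estimates)
  where
    len       = fromIntegral (length sample)
    mean      = sum sample / len
    sd        = sqrt (sum [(x - mean) * (x - mean) | x <- sample] / (len - 1))
    h         = sd * factor len                        -- automatically chosen bandwidth
    lo        = minimum sample - h * k                 -- range padded by the scaled bandwidth
    hi        = maximum sample + h * k
    step      = (hi - lo) / fromIntegral (n - 1)
    points    = [lo + step * fromIntegral i | i <- [0 .. n - 1]]
    f         = 1 / (len * h)                          -- kernel scaling factor 1/(n*h)
    estimates = [sum [kernel f h p v | v <- sample] | p <- points]

-- A Gaussian entry point built from the helper, with scaling factor 3.
gaussianPDFSketch :: Int -> [Double] -> ([Double], [Double])
gaussianPDFSketch = simpleSketch (\m -> (4 / (3 * m)) ** 0.2) gauss 3
  where
    gauss f h p v = let u = (v - p) / h in f * exp (-0.5 * u * u) / sqrt (2 * pi)
```

Padding by three bandwidths for a Gaussian kernel means the points cover nearly all of the estimated density's mass, so the estimates multiplied by the point spacing sum to approximately 1.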
References
- Deheuvels, P. (1977) Estimation non paramétrique de la densité par histogrammes généralisés. <http://archive.numdam.org/article/RSA_1977__25_3_5_0.pdf>