Portability | portable |
---|---|
Stability | experimental |
Maintainer | bos@serpentine.com |
Kernel density estimation code, providing non-parametric ways to estimate the probability density function of a sample.
- epanechnikovPDF :: Vector v Double => Int -> v Double -> (Points, Vector Double)
- gaussianPDF :: Vector v Double => Int -> v Double -> (Points, Vector Double)
- newtype Points = Points {}
- choosePoints :: Vector v Double => Int -> Double -> v Double -> Points
- type Bandwidth = Double
- bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth
- epanechnikovBW :: Double -> Bandwidth
- gaussianBW :: Double -> Bandwidth
- type Kernel = Double -> Double -> Double -> Double -> Double
- epanechnikovKernel :: Kernel
- gaussianKernel :: Kernel
- estimatePDF :: Vector v Double => Kernel -> Bandwidth -> v Double -> Points -> Vector Double
- simplePDF :: Vector v Double => (Double -> Double) -> Kernel -> Double -> Int -> v Double -> (Points, Vector Double)
Simple entry points
epanechnikovPDF
:: Vector v Double | |
=> Int | Number of points at which to estimate |
-> v Double | Data sample |
-> (Points, Vector Double) |
Simple Epanechnikov kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
gaussianPDF
:: Vector v Double | |
=> Int | Number of points at which to estimate |
-> v Double | Data sample |
-> (Points, Vector Double) |
Simple Gaussian kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
Building blocks
Choosing points from a sample
Points from the range of a Sample.
choosePoints
:: Vector v Double | |
=> Int | Number of points to select, n |
-> Double | Sample bandwidth, h |
-> v Double | Input data |
-> Points |
Choose a uniform range of points at which to estimate a sample's probability density function.
If you are using a Gaussian kernel, multiply the sample's bandwidth by 3 before passing it to this function.
If this function is passed an empty vector, it returns values of positive and negative infinity.
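The point selection described above can be illustrated with a small list-based sketch. It is not the library's implementation: the name `choosePointsSketch` is hypothetical, lists stand in for vectors, and padding the sample range by `h` on each side is an assumption about how the bandwidth argument is used. Unlike the real function, the sketch requires a non-empty sample.

```haskell
-- Hypothetical sketch, not the library's code.
-- Picks n evenly spaced points covering the sample range,
-- padded by h on both sides (an assumption, see the text above).
choosePointsSketch :: Int -> Double -> [Double] -> [Double]
choosePointsSketch n h sample
  | n <= 0    = []
  | n == 1    = [lo]
  | otherwise = [lo + fromIntegral i * step | i <- [0 .. n - 1]]
  where
    lo   = minimum sample - h   -- requires a non-empty sample
    hi   = maximum sample + h
    step = (hi - lo) / fromIntegral (n - 1)
```

For a Gaussian kernel, the advice above would correspond to calling `choosePointsSketch n (3 * h) sample`.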
Bandwidth estimation
bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth
Compute the optimal bandwidth from the observed data for the given kernel.
epanechnikovBW :: Double -> Bandwidth
Bandwidth estimator for an Epanechnikov kernel.
gaussianBW :: Double -> Bandwidth
Bandwidth estimator for a Gaussian kernel.
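To make the shape of these functions concrete, here is a minimal sketch using the widely used normal-reference rule of thumb for a Gaussian kernel, h = σ · (4 / 3n)^(1/5). Whether the library uses exactly this formula is not confirmed by this page. The sketch assumes that the `Double` argument of an estimator is the sample size and that the result is scaled by the sample standard deviation. All names ending in `Sketch` are hypothetical, and lists stand in for vectors.

```haskell
-- Hypothetical sketch, not the library's code.

-- Sample standard deviation (n - 1 denominator); needs at least two values.
stdDevSketch :: [Double] -> Double
stdDevSketch xs = sqrt (sum [(x - mean) * (x - mean) | x <- xs] / (n - 1))
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n

-- Normal-reference rule of thumb for a Gaussian kernel,
-- as a function of the sample size n (assumed meaning of the argument).
gaussianBWSketch :: Double -> Double
gaussianBWSketch n = (4 / (3 * n)) ** 0.2

-- Combine a per-kernel estimator with the spread of the data
-- (assumed: the estimator's result is multiplied by the standard deviation).
bandwidthSketch :: (Double -> Double) -> [Double] -> Double
bandwidthSketch estimator xs =
  stdDevSketch xs * estimator (fromIntegral (length xs))
```

A call such as `bandwidthSketch gaussianBWSketch sample` mirrors the way `bandwidth` takes a kernel-specific estimator as its first argument.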
Kernels
type Kernel = Double -> Double -> Double -> Double -> Double
The convolution kernel. Its parameters are as follows:
- Scaling factor, 1/nh
- Bandwidth, h
- A point at which to sample the input, p
- One sample value, v
epanechnikovKernel :: Kernel
Epanechnikov kernel for probability density function estimation.
gaussianKernel :: Kernel
Gaussian kernel for probability density function estimation.
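As an illustration of the four-argument kernel shape, the following sketch writes the textbook Gaussian and Epanechnikov kernels with the parameter order listed above (scaling factor `f`, bandwidth `h`, evaluation point `p`, sample value `v`). The names are hypothetical, and the library's own definitions may use different constants or scaling.

```haskell
-- Hypothetical sketch, not the library's code.
type KernelSketch = Double -> Double -> Double -> Double -> Double

-- Textbook Gaussian kernel: standard normal density of u = (v - p) / h,
-- multiplied by the scaling factor f.
gaussianKernelSketch :: KernelSketch
gaussianKernelSketch f h p v = f * exp (-0.5 * u * u) / sqrt (2 * pi)
  where
    u = (v - p) / h

-- Textbook Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero elsewhere,
-- multiplied by the scaling factor f.
epanechnikovKernelSketch :: KernelSketch
epanechnikovKernelSketch f h p v
  | abs u <= 1 = f * 0.75 * (1 - u * u)
  | otherwise  = 0
  where
    u = (v - p) / h
```

Each call gives the contribution of one sample value `v` to the estimate at the point `p`.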
Low-level estimation
estimatePDF
:: Vector v Double | |
=> Kernel | Kernel function |
-> Bandwidth | Bandwidth, h |
-> v Double | Sample data |
-> Points | Points at which to estimate |
-> Vector Double |
Kernel density estimator, providing a non-parametric way of estimating the PDF of a random variable.
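The estimation step can be sketched as a sum of kernel contributions at every requested point, using the scaling factor 1/nh from the kernel description. This is a list-based illustration under that assumption. The name `estimatePDFSketch` is hypothetical and is not the library's implementation.

```haskell
-- Hypothetical sketch, not the library's code.
-- For each point p, add up the kernel contribution of every sample value v.
estimatePDFSketch
  :: (Double -> Double -> Double -> Double -> Double)  -- kernel: f h p v
  -> Double                                            -- bandwidth h
  -> [Double]                                          -- sample data
  -> [Double]                                          -- points to estimate at
  -> [Double]
estimatePDFSketch kernel h sample points =
  [sum [kernel f h p v | v <- sample] | p <- points]
  where
    -- Scaling factor 1/nh, shared by all kernel evaluations.
    f = 1 / (fromIntegral (length sample) * h)
```

Passing the kernel as an argument keeps the estimator independent of any particular kernel shape, which matches the signature above.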
simplePDF
:: Vector v Double | |
=> (Double -> Double) | Bandwidth function |
-> Kernel | Kernel function |
-> Double | Bandwidth scaling factor (3 for a Gaussian kernel, 1 for all others) |
-> Int | Number of points at which to estimate |
-> v Double | Sample data |
-> (Points, Vector Double) |
A helper for creating a simple kernel density estimation function with automatically chosen bandwidth and estimation points.
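A helper of this kind can be sketched as one function that derives a bandwidth, chooses the points, and then evaluates the kernel at each of them. The sketch below is self-contained and list-based. The name `simplePDFSketch` is hypothetical, and the way the bandwidth is derived (standard deviation times the bandwidth function of the sample size) and the way the range is padded (bandwidth times the scaling factor) are assumptions, not details confirmed by this page. It expects at least two sample values and `n >= 2`.

```haskell
-- Hypothetical sketch, not the library's code.
simplePDFSketch
  :: (Double -> Double)                                -- bandwidth function
  -> (Double -> Double -> Double -> Double -> Double)  -- kernel: f h p v
  -> Double                                            -- bandwidth scaling factor
  -> Int                                               -- number of points, n >= 2
  -> [Double]                                          -- sample data, >= 2 values
  -> ([Double], [Double])
simplePDFSketch bwFn kernel scale n sample = (points, estimates)
  where
    len  = fromIntegral (length sample)
    mean = sum sample / len
    sd   = sqrt (sum [(x - mean) * (x - mean) | x <- sample] / (len - 1))
    -- Assumed: bandwidth = standard deviation * bwFn (sample size).
    h    = sd * bwFn len
    -- Assumed: the sample range is padded by the scaled bandwidth.
    pad  = h * scale
    lo   = minimum sample - pad
    hi   = maximum sample + pad
    step = (hi - lo) / fromIntegral (n - 1)
    points    = [lo + fromIntegral i * step | i <- [0 .. n - 1]]
    -- Scaling factor 1/nh passed to every kernel evaluation.
    f         = 1 / (len * h)
    estimates = [sum [kernel f h p v | v <- sample] | p <- points]
```

Fixing the first three arguments yields a function of the same shape as the simple entry points: it takes a point count and a sample and returns the points together with the estimates.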