| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
Data.VPTree.Build
Contents
Documentation
Arguments
| :: (RealFrac p, Floating d, Ord d, Eq a) | |
| => (a -> a -> d) | distance function |
| -> p | proportion of remaining dataset to sample at each level, \(0 < p <= 1 \) |
| -> Vector a | dataset used for constructing the index |
| -> VPTree d a |
Build a VPTree
The supplied distance function d must satisfy the definition of a metric, i.e.
- identity of indiscernible elements : \( d(x, y) = 0 \leftrightarrow x \equiv y \)
- symmetry : \( d(x, y) = d(y, x) \)
- triangle inequality : \( d(x, y) + d(y, z) >= d(x, z) \)
The current implementation makes multiple passes over the whole dataset, which is why the entire indexing dataset must be present in memory (packed as a Vector).
Implementation detail : construction of a VP-tree requires a randomized algorithm, but we run that in the ST monad so the result is pure.