Portability | portable |
---|---|
Stability | Alpha quality. Interface may change without notice. |
Maintainer | josef.svenningsson@gmail.com |
Burhard-Keller trees provide an implementation of sets which apart
from the ordinary operations also has an approximate member search,
allowing you to search for elements that are of a distance n
from
the element you are searching for. The distance is determined using
a metric on the type of elements. Therefore all elements must
implement the Metric
type class, rather than the more usual
Ord
.
Useful metrics include the manhattan distance between two points, the Levenshtein edit distance between two strings, the number of edges in the shortest path between two nodes in a undirected graph and the Hamming distance between two binary strings. Any euclidean space also has a metric. However, in this module we use int-valued metrics and that doesn't quite with the metrics of euclidean spaces which are real-values.
The worst case complexity of many of these operations is quite bad,
but the expected behavior varies greatly with the metric. For
example, the discrete metric (distance x y | y == x = 0 |
otherwise = 1
) makes BK-trees behave abysmally. The metrics
mentioned above should give good performance characteristics.
- data BKTree a
- class Eq a => Metric a where
- null :: BKTree a -> Bool
- empty :: BKTree a
- fromList :: Metric a => [a] -> BKTree a
- singleton :: a -> BKTree a
- insert :: Metric a => a -> BKTree a -> BKTree a
- member :: Metric a => a -> BKTree a -> Bool
- memberDistance :: Metric a => Int -> a -> BKTree a -> Bool
- delete :: Metric a => a -> BKTree a -> BKTree a
- union :: Metric a => BKTree a -> BKTree a -> BKTree a
- unions :: Metric a => [BKTree a] -> BKTree a
- elems :: BKTree a -> [a]
- elemsDistance :: Metric a => Int -> a -> BKTree a -> [a]
- closest :: Metric a => a -> BKTree a -> Maybe (a, Int)
Documentation
class Eq a => Metric a whereSource
A type is Metric
if is has a function distance
which has the following
properties:
distance
x y >= 0-
if and only ifdistance
x y == 0x == y
distance
x y ==distance
y xdistance
x z <=distance
x y +distance
y z
All types of elements to BKTree
must implement Metric
.
This definition of a metric deviates from the mathematical one in that it returns an integer instead of a real number. The reason for choosing integers is that I wanted to avoid the rather unpredictable rounding of floating point numbers.
insert :: Metric a => a -> BKTree a -> BKTree aSource
Inserts an element into the tree. If an element is inserted several times it will be stored several times.
memberDistance :: Metric a => Int -> a -> BKTree a -> BoolSource
Approximate searching.
will return true if
there is an element in memberDistance
n a treetree
which has a distance
less than or equal to
n
from a
.
delete :: Metric a => a -> BKTree a -> BKTree aSource
Removes an element from the tree. If an element occurs several times in the only the first occurrence will be deleted.
elemsDistance :: Metric a => Int -> a -> BKTree a -> [a]Source
returns all the elements in elemsDistance
n a treetree
which are
at a distance
less than or equal to n
from the element a
.