Portability | portable |
---|---|

Stability | Alpha quality. Interface may change without notice. |

Maintainer | josef.svenningsson@gmail.com |

Burhard-Keller trees provide an implementation of sets which apart
from the ordinary operations also has an approximate member search,
allowing you to search for elements that are of a distance `n`

from
the element you are searching for. The distance is determined using
a metric on the type of elements. Therefore all elements must
implement the `Metric`

type class, rather than the more usual
`Ord`

.

Useful metrics include the manhattan distance between two points, the Levenshtein edit distance between two strings, the number of edges in the shortest path between two nodes in a undirected graph and the Hamming distance between two binary strings. Any euclidean space also has a metric. However, in this module we use int-valued metrics and that doesn't quite with the metrics of euclidean spaces which are real-values.

The worst case complexity of many of these operations is quite bad,
but the expected behavior varies greatly with the metric. For
example, the discrete metric (```
distance x y | y == x = 0 |
otherwise = 1
```

) makes BK-trees behave abysmally. The metrics
mentioned above should give good performance characteristics.

- data BKTree a
- class Eq a => Metric a where
- null :: BKTree a -> Bool
- empty :: BKTree a
- fromList :: Metric a => [a] -> BKTree a
- singleton :: a -> BKTree a
- insert :: Metric a => a -> BKTree a -> BKTree a
- member :: Metric a => a -> BKTree a -> Bool
- memberDistance :: Metric a => Int -> a -> BKTree a -> Bool
- delete :: Metric a => a -> BKTree a -> BKTree a
- union :: Metric a => BKTree a -> BKTree a -> BKTree a
- unions :: Metric a => [BKTree a] -> BKTree a
- elems :: BKTree a -> [a]
- elemsDistance :: Metric a => Int -> a -> BKTree a -> [a]
- closest :: Metric a => a -> BKTree a -> Maybe (a, Int)

# Documentation

class Eq a => Metric a whereSource

A type is `Metric`

if is has a function `distance`

which has the following
properties:

`distance`

x y >= 0-

if and only if`distance`

x y == 0`x == y`

`distance`

x y ==`distance`

y x`distance`

x z <=`distance`

x y +`distance`

y z

All types of elements to `BKTree`

must implement `Metric`

.

This definition of a metric deviates from the mathematical one in that it returns an integer instead of a real number. The reason for choosing integers is that I wanted to avoid the rather unpredictable rounding of floating point numbers.

insert :: Metric a => a -> BKTree a -> BKTree aSource

Inserts an element into the tree. If an element is inserted several times it will be stored several times.

memberDistance :: Metric a => Int -> a -> BKTree a -> BoolSource

Approximate searching.

will return true if
there is an element in `memberDistance`

n a tree`tree`

which has a `distance`

less than or equal to
`n`

from `a`

.

delete :: Metric a => a -> BKTree a -> BKTree aSource

Removes an element from the tree. If an element occurs several times in the only the first occurrence will be deleted.

elemsDistance :: Metric a => Int -> a -> BKTree a -> [a]Source

returns all the elements in `elemsDistance`

n a tree`tree`

which are
at a `distance`

less than or equal to `n`

from the element `a`

.