Safe Haskell | None |
---|

An implementation of `Search`

based on a ternary search tree
(`TSTSet`

): https://en.wikipedia.org/wiki/Ternary_search_tree.

The searches are performed by manually generating the close word by deleting, transposing, or adding wildcards to match additional characters.

This makes this structure fast for small distances with a small number of
generated candidates, but impractical for even slightly larger distances -
in my tests `BK`

outpeforms this module when the
distance is greater than 2.

The data structure has no knowledge of the distance and thus it does not
need to be rebuilt if different edit distances are needed. However this
means that it cannot work with arbitrary `EditDistance`

instances are
functions need to be defined manually to generate the candidates. In this
case `levenshtein`

uses `deletions`

, `replaces`

, and `insertions`

to
generate the candidates; while `damerauLevenshtein`

also uses
`transpositions`

.

- data TSTSet sym
- empty :: Ord sym => TSTSet sym
- insert :: (Ord sym, ListLike full sym) => full -> TSTSet sym -> TSTSet sym
- levenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance Levenshtein)]
- damerauLevenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance DamerauLevenshtein)]
- data WildCard a
- type WildList a = [WildCard a]
- deletions :: [a] -> [[a]]
- transpositions :: [a] -> [[a]]
- replaces :: WildList a -> [WildList a]
- insertions :: WildList a -> [WildList a]

# Type

# Operations

levenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance Levenshtein)]Source

damerauLevenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance DamerauLevenshtein)]Source

# Candidates generation

The following functions generate candidates distant 1 from the given word, and they are reexported for completeness.

`deletions`

and `transpositions`

do not need to wildcard the word, while
`replaces`

and `insertions`

do since we are adding characters.

data WildCard a

## Operations

transpositions :: [a] -> [[a]]Source

insertions :: WildList a -> [WildList a]Source