language-spelling-0.2: Various tools to detect/correct mistakes in words

Safe HaskellNone

Language.Distance.Search.TST

Contents

Description

An implementation of Search based on a ternary search tree (TSTSet): https://en.wikipedia.org/wiki/Ternary_search_tree.

The searches are performed by manually generating the close word by deleting, transposing, or adding wildcards to match additional characters.

This makes this structure fast for small distances with a small number of generated candidates, but impractical for even slightly larger distances - in my tests BK outpeforms this module when the distance is greater than 2.

The data structure has no knowledge of the distance and thus it does not need to be rebuilt if different edit distances are needed. However this means that it cannot work with arbitrary EditDistance instances are functions need to be defined manually to generate the candidates. In this case levenshtein uses deletions, replaces, and insertions to generate the candidates; while damerauLevenshtein also uses transpositions.

Synopsis

Type

data TSTSet sym Source

Instances

Ord sym => Eq (TSTSet sym) 
(Show sym, Ord sym) => Show (TSTSet sym) 

Operations

empty :: Ord sym => TSTSet symSource

insert :: (Ord sym, ListLike full sym) => full -> TSTSet sym -> TSTSet symSource

levenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance Levenshtein)]Source

damerauLevenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance DamerauLevenshtein)]Source

Candidates generation

The following functions generate candidates distant 1 from the given word, and they are reexported for completeness.

deletions and transpositions do not need to wildcard the word, while replaces and insertions do since we are adding characters.

data WildCard a Source

Constructors

WildCard 
El a 

Instances

Eq a => Eq (WildCard a) 
(Eq (WildCard a), Ord a) => Ord (WildCard a) 
Show a => Show (WildCard a) 

Operations

deletions :: [a] -> [[a]]Source

transpositions :: [a] -> [[a]]Source