language-spelling-0.3.2: Various tools to detect/correct mistakes in words

Safe HaskellNone




An implementation of Search based on a ternary search tree (TSTSet):

The searches are performed by manually generating the close word by deleting, transposing, or adding wildcards to match additional characters.

This makes this structure fast for small distances with a small number of generated candidates, but impractical for even slightly larger distances - in my tests BK outpeforms this module when the distance is greater than 2.

The data structure has no knowledge of the distance and thus it does not need to be rebuilt if different edit distances are needed. However this means that it cannot work with arbitrary EditDistance instances are functions need to be defined manually to generate the candidates. In this case levenshtein uses deletions, replaces, and insertions to generate the candidates; while damerauLevenshtein also uses transpositions.



data TSTSet sym


Ord sym => Eq (TSTSet sym) 
(Show sym, Ord sym) => Show (TSTSet sym) 


empty :: Ord sym => TSTSet sym

insert :: (Ord sym, ListLike full sym) => full -> TSTSet sym -> TSTSet symSource

levenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance Levenshtein)]Source

damerauLevenshtein :: (Ord sym, ListLike full sym) => Int -> full -> TSTSet sym -> [(full, Distance DamerauLevenshtein)]Source

Candidates generation

The following functions generate candidates distant 1 from the given word, and they are reexported for completeness.

deletions and transpositions do not need to wildcard the word, while replaces and insertions do since we are adding characters.

data WildCard a


El a 


Eq a => Eq (WildCard a) 
(Eq (WildCard a), Ord a) => Ord (WildCard a) 
Show a => Show (WildCard a) 

type WildList a = [WildCard a]


deletions :: [a] -> [[a]]Source

transpositions :: [a] -> [[a]]Source