| Safe Haskell | None |
|---|---|
| Language | Haskell2010 |
Data.FuzzySet.Internal
Documentation
getMatch :: GetContext -> Size -> [(Double, Text)]
results :: GetContext -> Size -> [(Double, Text)]
gramMap
Arguments
| :: Text | An input string |
| -> Size | The gram size n, which must be at least 2 |
| -> HashMap Text Int | A mapping from n-gram keys to the number of occurrences of the key in the list returned by grams |
Normalize the input string, call grams on the normalized input, and then
translate the result to a HashMap with the n-grams as keys and Int
values corresponding to the number of occurrences of the key in the
generated gram list.
>>> gramMap "xxxx" 2
fromList [("-x",1), ("xx",3), ("x-",1)]
>>> Data.HashMap.Strict.lookup "nts" (gramMap "intrent'srestaurantsomeoftrent'saunt'santswantsamtorentsomepants" 3)
Just 8
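The counting step described above can be sketched as follows. This is only an illustration, not the module's actual code: the names gramListSketch and gramMapSketch are hypothetical, the normalization rule (lowercase, keep only alphanumerics and spaces) is an assumption inferred from the examples on this page, and Data.Map.Strict from containers stands in for HashMap so that the sketch needs only packages that ship with GHC.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (isAlphaNum, isSpace)
import qualified Data.Map.Strict as Map
import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical stand-in for the gram list: lowercase the input, drop
-- everything except alphanumerics and spaces (an assumed normalization
-- rule), enclose the result in hyphens, and take every substring of
-- length n. Assumes n is at least 2.
gramListSketch :: Text -> Int -> [Text]
gramListSketch input n =
  [T.take n (T.drop i padded) | i <- [0 .. T.length padded - n]]
  where
    cleaned = T.filter (\c -> isAlphaNum c || isSpace c) (T.toLower input)
    padded = T.concat ["-", cleaned, "-"]

-- | Count how often each gram occurs in the generated list.
-- Uses an ordered Map here; the documented function returns a HashMap.
gramMapSketch :: Text -> Int -> Map.Map Text Int
gramMapSketch input n =
  Map.fromListWith (+) [(g, 1) | g <- gramListSketch input n]

-- >>> :set -XOverloadedStrings
-- >>> gramMapSketch "xxxx" 2
-- fromList [("-x",1),("x-",1),("xx",3)]
-- >>> Map.lookup "nts" (gramMapSketch "intrent'srestaurantsomeoftrent'saunt'santswantsamtorentsomepants" 3)
-- Just 8
```

Because Map is ordered by key, the entries print in a different order than in the HashMap example above; the counts are the same.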
grams
Arguments
| :: Text | An input string |
| -> Size | The variable n, which must be at least 2 |
| -> [Text] | A k-length list of grams of size n, with \(k = s - n + 3\), where s is the length of the normalized input |
Break apart the normalized input string into a list of n-grams. For instance, the string "Destroido Corp." is first normalized into the form "destroido corp", and then enclosed in hyphens, so that it becomes "-destroido corp-". The 3-grams generated from this normalized string are
"-de", "des", "est", "str", "tro", "roi", "oid", "ido", "do ", "o c", " co", "cor", "orp", "rp-"
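Given a normalized string of length s, the two enclosing hyphens bring the length to \(s + 2\). We take all substrings of length n, letting the offset range from \(0\) to \(s + 2 - n\). The number of n-grams for a normalized string of length s is thus \(s + 2 - n + 1 = s - n + 3\), where \(2 \le n \le s + 2\). In the example above, \(s = 14\) and \(n = 3\), which gives the 14 grams listed.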
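A minimal sketch of the procedure, assuming that normalization means lowercasing and keeping only alphanumeric characters and spaces (this matches the examples on this page, but the library's actual rule may differ). The names normalizeSketch and gramsSketch are hypothetical and are not part of Data.FuzzySet.Internal.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Char (isAlphaNum, isSpace)
import Data.Text (Text)
import qualified Data.Text as T

-- | Hypothetical stand-in for the normalization step (assumed rule):
-- lowercase the input and keep only alphanumerics and spaces.
normalizeSketch :: Text -> Text
normalizeSketch = T.filter (\c -> isAlphaNum c || isSpace c) . T.toLower

-- | Enclose the normalized input in hyphens, then take every substring
-- of length n, with offsets running from 0 to s + 2 - n.
gramsSketch :: Text -> Int -> [Text]
gramsSketch input n
  | n < 2 = error "gram size must be at least 2"
  | otherwise =
      [T.take n (T.drop i padded) | i <- [0 .. T.length padded - n]]
  where
    padded = T.concat ["-", normalizeSketch input, "-"]

-- >>> :set -XOverloadedStrings
-- >>> gramsSketch "xxxx" 2
-- ["-x","xx","xx","xx","x-"]
-- >>> length (gramsSketch "Destroido Corp." 3)
-- 14
```

When n exceeds the padded length, the offset range is empty and the sketch returns an empty list.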