Safe Haskell | None |
---|---|
Language | Haskell2010 |
Map between Strings that represent characters and their Int-based representation.
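The idea of mapping strings to Int ids and back can be sketched as follows. This is a minimal illustration built on Data.Map from containers; the names Interner, intern, and unintern are hypothetical, and the module's own BTI type is not reproduced here.

```haskell
import qualified Data.Map.Strict as M

-- | Hypothetical two-way table between strings and dense Int ids.
data Interner = Interner
  { toId   :: M.Map String Int
  , fromId :: M.Map Int String
  }

emptyInterner :: Interner
emptyInterner = Interner M.empty M.empty

-- | Return the id for a string, allocating the next free id if it is unseen.
intern :: String -> Interner -> (Int, Interner)
intern s tbl@(Interner f b) =
  case M.lookup s f of
    Just i  -> (i, tbl)
    Nothing ->
      let i = M.size f
      in  (i, Interner (M.insert s i f) (M.insert i s b))

-- | Recover the string for an id, if that id has been allocated.
unintern :: Int -> Interner -> Maybe String
unintern i = M.lookup i . fromId
```

Interning the same string twice yields the same id, so later comparisons and lookups can work on Ints instead of on the strings themselves.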
NOTE filtering the scores list and creating a single bigram map takes about 70 seconds.
NOTE A single bigram map costs around 160 MByte of RAM. This includes the
overhead for actually storing the bigrams once (creating pointers instead of
multiple copied Bigram data structures).
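One common way to store each value once and hand out pointers to it is to route every value through a pool of canonical copies. The sketch below shows that general technique only; Pool, canonical, and shareAll are hypothetical names, and nothing here is taken from the module's actual implementation.

```haskell
import qualified Data.Map.Strict as M
import Data.List (mapAccumL)

-- | Hypothetical pool holding one canonical copy of each value seen so far.
type Pool a = M.Map a a

-- | Return the pooled copy of a value, adding it to the pool if it is new.
canonical :: Ord a => Pool a -> a -> (Pool a, a)
canonical pool x =
  case M.lookup x pool of
    Just shared -> (pool, shared)          -- reuse the copy already stored
    Nothing     -> (M.insert x x pool, x)  -- first occurrence becomes canonical

-- | Replace every element of a list with its canonical copy.
shareAll :: Ord a => [a] -> (Pool a, [a])
shareAll = mapAccumL canonical M.empty
```

After shareAll, equal elements of the result refer to the same pooled value, so the memory cost grows with the number of distinct values rather than with the number of occurrences.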
- data Bigram = Bigram {}
- withDefault :: Double -> [ByteString] -> (Double, [ByteString])
- parseLine :: ByteString -> (BTI, BTI, Bigram, Bigram, Double)
- type Lang = BTI
- type Line = (Lang, Lang, Bigram, Bigram, Double)
- type Scores = HashMap (Bigram :!: Bigram) Double
- data Mapping = Mapping {}
- lines2mapping :: [Line] -> Mapping
- emptyMapping :: Mapping
- mkMapping :: Mapping -> [Line] -> Mapping
- generateLookups :: Set BTI -> Double -> ByteString -> Mapping
Documentation
withDefault :: Double -> [ByteString] -> (Double, [ByteString])
Tries to read the first line to determine whether it contains a default score.
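A function with this signature could behave as in the sketch below. It assumes, purely for illustration, that a default score appears as a first line consisting of a single number; the real input format is not documented here, and withDefaultSketch is a hypothetical stand-in, not the module's withDefault.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Read (readMaybe)

-- | Hypothetical stand-in for withDefault.
-- Assumption: a default score, if present, is a first line holding one number.
withDefaultSketch :: Double -> [B.ByteString] -> (Double, [B.ByteString])
withDefaultSketch _ (l:ls)
  | Just d <- readMaybe (B.unpack l) = (d, ls)  -- consume the default line
withDefaultSketch fallback ls = (fallback, ls)  -- keep all lines, use fallback
```

When the first line does not parse as a number, the input is returned unchanged together with the fallback passed by the caller.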
lines2mapping :: [Line] -> Mapping
generateLookups :: Set BTI -> Double -> ByteString -> Mapping
Given a set of acceptable languages, a default score, and the lazy
bytestring of scores, create the Mapping of languages and scores.
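The overall shape of such a function (parse lines, keep only acceptable languages, collect scores) can be sketched with simplified stand-in types. Everything below is an assumption made for illustration: the whitespace-separated line format "lang1 lang2 bigram1 bigram2 score", the rule that both languages must be in the set, and the names SimpleMapping, parseSimple, and generateSketch. The module's real Mapping, Bigram, and BTI types are not reproduced.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL
import qualified Data.Map.Strict as M
import qualified Data.Set as S
import Data.List (foldl')
import Text.Read (readMaybe)

type Lang   = BL.ByteString
type Bigram = BL.ByteString

-- | Simplified stand-in for the module's Mapping (hypothetical).
data SimpleMapping = SimpleMapping
  { defaultScore :: Double
  , scores       :: M.Map (Lang, Lang, Bigram, Bigram) Double
  } deriving (Eq, Show)

-- | Parse one line; assumed format: "lang1 lang2 bigram1 bigram2 score".
parseSimple :: BL.ByteString -> Maybe (Lang, Lang, Bigram, Bigram, Double)
parseSimple l = case BL.words l of
  [a, b, x, y, s] -> fmap (\d -> (a, b, x, y, d)) (readMaybe (BL.unpack s))
  _               -> Nothing

-- | Keep lines whose two languages are both acceptable, then collect scores.
generateSketch :: S.Set Lang -> Double -> BL.ByteString -> SimpleMapping
generateSketch langs def input = SimpleMapping def (foldl' add M.empty kept)
  where
    parsed = [ p | Just p <- map parseSimple (BL.lines input) ]
    kept   = [ p | p@(a, b, _, _, _) <- parsed
                 , a `S.member` langs, b `S.member` langs ]
    add m (a, b, x, y, d) = M.insert (a, b, x, y) d m
```

Lines that fail to parse are silently skipped in this sketch; a real implementation might prefer to report them.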