Safe Haskell	None
Language	Haskell2010

Freq.Internal

Contents

Frequency table type
Construction
Training
Using a trained model
Pretty Printing

Description

This is the internal module to Freq. The primary differences are that this module exports the typeclass Freaky, as well as the data constructors of FreqTrain and Freq.

Synopsis

Frequency table type

newtype FreqTrain Source #

A FreqTrain is a digram-based frequency table.

One can construct a FreqTrain with train, trainWith, or trainWithMany.

One can use a trained FreqTrain with prob and measure.

mappend == <> will add the values of each of the matching keys.

It is highly recommended to convert a FreqTrain to a Freq with tabulate before using the trained model, because Freqs have O(1) reads as well as significantly faster constant-time operations, however keep in mind that Freqs cannot be neither modified nor converted back to a FreqTrain.

Constructors

FreqTrain
Fields _getFreqTrain :: Map Word8 (Map Word8 Double)

Instances

Eq FreqTrain Source #
Methods (==) :: FreqTrain -> FreqTrain -> Bool # (/=) :: FreqTrain -> FreqTrain -> Bool #
Data FreqTrain Source #
Methods gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> FreqTrain -> c FreqTrain # gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c FreqTrain # toConstr :: FreqTrain -> Constr # dataTypeOf :: FreqTrain -> DataType # dataCast1 :: Typeable (* -> ) t => (forall d. Data d => c (t d)) -> Maybe (c FreqTrain) # dataCast2 :: Typeable ( -> * -> *) t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c FreqTrain) # gmapT :: (forall b. Data b => b -> b) -> FreqTrain -> FreqTrain # gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> FreqTrain -> r # gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> FreqTrain -> r # gmapQ :: (forall d. Data d => d -> u) -> FreqTrain -> [u] # gmapQi :: Int -> (forall d. Data d => d -> u) -> FreqTrain -> u # gmapM :: Monad m => (forall d. Data d => d -> m d) -> FreqTrain -> m FreqTrain # gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> FreqTrain -> m FreqTrain # gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> FreqTrain -> m FreqTrain #
Ord FreqTrain Source #
Methods compare :: FreqTrain -> FreqTrain -> Ordering # (<) :: FreqTrain -> FreqTrain -> Bool # (<=) :: FreqTrain -> FreqTrain -> Bool # (>) :: FreqTrain -> FreqTrain -> Bool # (>=) :: FreqTrain -> FreqTrain -> Bool # max :: FreqTrain -> FreqTrain -> FreqTrain # min :: FreqTrain -> FreqTrain -> FreqTrain #
Read FreqTrain Source #
Methods readsPrec :: Int -> ReadS FreqTrain # readList :: ReadS [FreqTrain] # readPrec :: ReadPrec FreqTrain # readListPrec :: ReadPrec [FreqTrain] #
Show FreqTrain Source #
Methods showsPrec :: Int -> FreqTrain -> ShowS # show :: FreqTrain -> String # showList :: [FreqTrain] -> ShowS #
Semigroup FreqTrain Source #
Methods (<>) :: FreqTrain -> FreqTrain -> FreqTrain # sconcat :: NonEmpty FreqTrain -> FreqTrain # stimes :: Integral b => b -> FreqTrain -> FreqTrain #
Monoid FreqTrain Source #
Methods mempty :: FreqTrain # mappend :: FreqTrain -> FreqTrain -> FreqTrain # mconcat :: [FreqTrain] -> FreqTrain #
NFData FreqTrain Source #
Methods rnf :: FreqTrain -> () #
Freaky FreqTrain Source #
Methods prob :: FreqTrain -> Word8 -> Word8 -> Double Source #

Construction

empty :: FreqTrain Source #

O(1). The empty frequency table.

singleton Source #

Arguments

:: Word8	Outer key
-> Word8	Inner key
-> Double	Weight
-> FreqTrain	The singleton frequency table

O(1). A Frequency table with a single entry.

tabulate :: FreqTrain -> Freq Source #

Optimise a FreqTrain for O(1) read access.

Training

train :: ByteString -> FreqTrain Source #

Given a ByteString consisting of training data, build a Frequency table.

trainWith Source #

Arguments

:: FilePath	`FilePath` containing training data
-> IO FreqTrain	Frequency table generated as a result of training, inside of `IO`.

Given a FilePath containing training data, build a Frequency table inside of the IO monad.

trainWithMany Source #

Arguments

:: Foldable t
=> t FilePath	`FilePath`s containing training data
-> IO FreqTrain	Frequency table generated as a result of training, inside of `IO`.

Given a list of FilePath containing training data, build a Frequency table inside of the IO monad.

Using a trained model

data Freq Source #

A variant of FreqTrain that holds identical information but is optimised for reads. There are no operations that imbue a Freq with additional information.

Reading from a Freq is orders of magnitude faster than reading from a FreqTrain. It is highly recommended that you use your trained model by first converting a FreqTrain to a Freq with tabulate.

Constructors

Freq
Fields _Dim :: !Int Width and height of square 2d array _2d :: !ByteArray Square two-dimensional array of Double, maps first char and second char to probability _Flat :: !ByteArray Array of Word8, length 256, acts as map from Word8 to table row/column index

Instances

Show Freq Source #
Methods showsPrec :: Int -> Freq -> ShowS # show :: Freq -> String # showList :: [Freq] -> ShowS #
Freaky Freq Source #
Methods prob :: Freq -> Word8 -> Word8 -> Double Source #

measure :: Freaky a => a -> ByteString -> Double Source #

Given a Frequency table and a ByteString, measure returns the probability that the ByteString is not randomised. The accuracy of measure is is heavily affected by your training data.

class Freaky a where Source #

Freaky is a typeclass that wraps the prob function, which allows for an extensible definition of measure.

It is used internally.

Minimal complete definition

prob

Methods

prob :: a -> Word8 -> Word8 -> Double Source #

Given a Frequency table and characters c1 and c2, what is the probability that c1 follows c2?

Instances

Freaky Freq Source #
Methods prob :: Freq -> Word8 -> Word8 -> Double Source #
Freaky FreqTrain Source #
Methods prob :: FreqTrain -> Word8 -> Word8 -> Double Source #

Pretty Printing

prettyFreqTrain :: FreqTrain -> IO () Source #

Pretty-print a FreqTrain.