freq-0.1.0.4: Are you ready to get freaky?

Safe HaskellNone
LanguageHaskell2010

Freq.Internal

Contents

Description

This is the internal module to Freq. The primary differences are that this module exports the typeclass Freaky, as well as the data constructors of FreqTrain and Freq.

Synopsis

Frequency table type

newtype FreqTrain Source #

A FreqTrain is a digram-based frequency table.

One can construct a FreqTrain with train, trainWith, or trainWithMany.

One can use a trained FreqTrain with prob and measure.

mappend == <> will add the values of each of the matching keys.

It is highly recommended to convert a FreqTrain to a Freq with tabulate before using the trained model, because Freqs have O(1) reads as well as significantly faster constant-time operations, however keep in mind that Freqs cannot be neither modified nor converted back to a FreqTrain.

Constructors

FreqTrain 

Instances

Eq FreqTrain Source # 
Data FreqTrain Source # 

Methods

gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> FreqTrain -> c FreqTrain #

gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c FreqTrain #

toConstr :: FreqTrain -> Constr #

dataTypeOf :: FreqTrain -> DataType #

dataCast1 :: Typeable (* -> *) t => (forall d. Data d => c (t d)) -> Maybe (c FreqTrain) #

dataCast2 :: Typeable (* -> * -> *) t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c FreqTrain) #

gmapT :: (forall b. Data b => b -> b) -> FreqTrain -> FreqTrain #

gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> FreqTrain -> r #

gmapQr :: (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> FreqTrain -> r #

gmapQ :: (forall d. Data d => d -> u) -> FreqTrain -> [u] #

gmapQi :: Int -> (forall d. Data d => d -> u) -> FreqTrain -> u #

gmapM :: Monad m => (forall d. Data d => d -> m d) -> FreqTrain -> m FreqTrain #

gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> FreqTrain -> m FreqTrain #

gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> FreqTrain -> m FreqTrain #

Ord FreqTrain Source # 
Read FreqTrain Source # 
Show FreqTrain Source # 
Semigroup FreqTrain Source # 
Monoid FreqTrain Source # 
NFData FreqTrain Source # 

Methods

rnf :: FreqTrain -> () #

Freaky FreqTrain Source # 

Methods

prob :: FreqTrain -> Word8 -> Word8 -> Double Source #

Construction

empty :: FreqTrain Source #

O(1). The empty frequency table.

singleton Source #

Arguments

:: Word8

Outer key

-> Word8

Inner key

-> Double

Weight

-> FreqTrain

The singleton frequency table

O(1). A Frequency table with a single entry.

tabulate :: FreqTrain -> Freq Source #

Optimise a FreqTrain for O(1) read access.

Training

train :: ByteString -> FreqTrain Source #

Given a ByteString consisting of training data, build a Frequency table.

trainWith Source #

Arguments

:: FilePath

FilePath containing training data

-> IO FreqTrain

Frequency table generated as a result of training, inside of IO.

Given a FilePath containing training data, build a Frequency table inside of the IO monad.

trainWithMany Source #

Arguments

:: Foldable t 
=> t FilePath

FilePaths containing training data

-> IO FreqTrain

Frequency table generated as a result of training, inside of IO.

Given a list of FilePath containing training data, build a Frequency table inside of the IO monad.

Using a trained model

data Freq Source #

A variant of FreqTrain that holds identical information but is optimised for reads. There are no operations that imbue a Freq with additional information.

Reading from a Freq is orders of magnitude faster than reading from a FreqTrain. It is highly recommended that you use your trained model by first converting a FreqTrain to a Freq with tabulate.

Constructors

Freq 

Fields

  • _Dim :: !Int

    Width and height of square 2d array

  • _2d :: !ByteArray

    Square two-dimensional array of Double, maps first char and second char to probability

  • _Flat :: !ByteArray

    Array of Word8, length 256, acts as map from Word8 to table row/column index

Instances

measure :: Freaky a => a -> ByteString -> Double Source #

Given a Frequency table and a ByteString, measure returns the probability that the ByteString is not randomised. The accuracy of measure is is heavily affected by your training data.

class Freaky a where Source #

Freaky is a typeclass that wraps the prob function, which allows for an extensible definition of measure.

It is used internally.

Minimal complete definition

prob

Methods

prob :: a -> Word8 -> Word8 -> Double Source #

Given a Frequency table and characters c1 and c2, what is the probability that c1 follows c2?

Instances

Pretty Printing

prettyFreqTrain :: FreqTrain -> IO () Source #

Pretty-print a FreqTrain.