statistics-dirichlet-0.6: Functions for working with Dirichlet densities and mixtures on vectors.

Portability: portable
Stability: experimental
Maintainer: felipe.lessa@gmail.com
Safe Haskell: Safe-Inferred

Math.Statistics.Dirichlet

Description

This module re-exports functions from Math.Statistics.Dirichlet.Mixture and Math.Statistics.Dirichlet.Options in a more digestible way. Since this library is under-documented, I recommend reading the documentation of the symbols re-exported here.

This module does not use Math.Statistics.Dirichlet.Density in any way. If you don't need mixtures then you should probably use that module directly since it's faster and more reliable (less magic happens there).

Synopsis

Data types (re-exported)

data DirichletMixture Source

A Dirichlet mixture.

Constructors

DM

Fields

  dmWeights :: !(Vector Double)
    Weights of each density.

  dmDensities :: !Matrix
    Values of all parameters of all densities. This matrix has `length dmWeights` rows.

`empty q n x` is an "empty" Dirichlet mixture with `q` components and `n` parameters. Each component has size `n`, weight inversely proportional to its index and all alphas set to `x`.
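As a sketch of what the description above means, here is a list-based analogue of `empty q n x` using only plain tuples (the real function returns a DirichletMixture backed by unboxed vectors). The exact normalization of the "inversely proportional" weights is an assumption of this illustration:

```haskell
-- Illustrative only: builds q components, each with n alphas all set to x,
-- and a weight proportional to 1/i for component index i, normalized to
-- sum to 1. The normalization is a guess at the documented behaviour.
emptySketch :: Int -> Int -> Double -> [(Double, [Double])]
emptySketch q n x = [ (w i / total, replicate n x) | i <- [1 .. q] ]
  where
    w i   = 1 / fromIntegral i          -- weight inversely proportional to index
    total = sum (map w [1 .. q])        -- normalizing constant
```

For example, `emptySketch 2 3 0.5` yields two components with alphas `[0.5, 0.5, 0.5]` each, the first weighted twice as heavily as the second.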

type Component = (Double, [Double]) Source

A list representation of a component of a Dirichlet mixture. Used by `fromList` and `toList` only.

`fromList xs` constructs a Dirichlet mixture from a non-empty list of components. Each component has a weight and a list of alpha values. The weights must sum to 1, all lists must have the same number of values, and every number must be non-negative. None of these preconditions are verified.

`toList dm` is the inverse of `fromList`: it constructs a list of components from a Dirichlet mixture. There are no error conditions, and `toList . fromList == id`.
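Since `fromList` does not verify its preconditions, a caller may want to check them before constructing a mixture. A minimal sketch of such a check, written against the plain-tuple expansion of `Component`:

```haskell
-- A sketch of the preconditions that `fromList` is documented to assume
-- but not verify. `ComponentSketch` is the plain-tuple form of `Component`.
type ComponentSketch = (Double, [Double])

validComponents :: [ComponentSketch] -> Bool
validComponents []          = False                  -- list must be non-empty
validComponents cs@(c0 : _) =
     abs (sum (map fst cs) - 1) < 1e-9               -- weights sum to 1
  && all ((== length (snd c0)) . length . snd) cs    -- equal-length alpha lists
  && all (all (>= 0) . snd) cs                       -- all values non-negative
```

For instance, `validComponents [(0.7, [1, 2, 0.5]), (0.3, [0.2, 4, 1.1])]` holds, while a list whose alpha lists differ in length does not.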

Options (re-exported)

A vector used for deriving the parameters of a Dirichlet density or mixture.

A vector of training vectors. This is the only vector that is not unboxed (for obvious reasons).

newtype StepSize Source

Usually denoted by the lowercase Greek letter eta (η), the size of each step taken along the gradient. Should be greater than zero and much less than one.

Constructors

 Step Double
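To illustrate the role a step size plays, here is a minimal, self-contained sketch of a single gradient-descent step over a list of parameters; this is only an illustration, not the library's actual update rule:

```haskell
-- Move each parameter a small amount against its gradient component.
-- With eta > 0 and small, each step makes a modest change to the parameters.
gradStep :: Double -> [Double] -> [Double] -> [Double]
gradStep eta = zipWith (\p g -> p - eta * g)
```

For example, `gradStep 0.01 [1, 2] [10, -10]` moves the parameters to approximately `[0.9, 2.1]`.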

type Delta = Double Source

Maximum difference between costs to consider that the process converged.

data Predicate Source

Predicate specifying when the training should be over.

Constructors

Pred

Fields

  maxIter :: !Int
    Maximum number of iterations.

  minDelta :: !Delta
    Minimum delta to continue iterating. This is invariant with respect to `deltaSteps`: if `deltaSteps` is `2`, then `minDelta` is considered twice as big to account for the different `deltaSteps`.

  deltaSteps :: !Int
    How many estimation steps should be done before recalculating the delta. If `deltaSteps` is `1` then it will be recalculated on every step.

  maxWeightIter :: !Int
    Maximum number of iterations on each weight step.

  jumpDelta :: !Delta
    Used only when calculating mixtures. When the delta drops below this cutoff, the computation switches from estimating the alphas to estimating the weights and vice versa. Should be greater than `minDelta`.

Instances

Eq Predicate
Read Predicate
Show Predicate
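The intent of the `maxIter` and `minDelta` fields can be sketched with a local record and a guessed stopping test: stop when the iteration budget is exhausted or the delta has converged. This is an illustration of the fields' meaning, not the library's actual logic (field names here are suffixed to avoid clashing with the real ones):

```haskell
-- A local stand-in for two of the documented Predicate fields.
data PredSketch = PredSketch
  { maxIterS  :: Int     -- maximum number of iterations
  , minDeltaS :: Double  -- minimum delta to continue iterating
  }

-- Training should be over once either bound is hit.
shouldStop :: PredSketch -> Int -> Double -> Bool
shouldStop p iter delta = iter >= maxIterS p || delta < minDeltaS p
```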

data Reason Source

Reason why the derivation was over.

Constructors

Delta
  The difference between applications of the cost function dropped below the minimum delta. In other words, it converged.

MaxIter
  The maximum number of iterations was reached while the delta was still greater than the minimum delta.

CG Result
  CG_DESCENT returned this result, which brought the derivation process to a halt.

Instances

Eq Reason
Read Reason
Show Reason

data Result a Source

Result of a derivation.

Constructors

Result

Fields

  reason :: !Reason
    Reason why the derivation was over.

  iters :: !Int
    Number of iterations spent.

  lastDelta :: !Delta
    Last difference between costs.

  lastCost :: !Double
    Last cost (i.e. the cost of the result).

  result :: !a
    Result obtained.

Instances

Eq a => Eq (Result a)
Read a => Read (Result a)
Show a => Show (Result a)
NFData a => NFData (Result a)

Training data (re-exported)

data TrainingData Source

Pre-processed training vectors (see `prepareTraining`).

Instances

Eq TrainingData
Show TrainingData

Prepares training vectors to be used as training data. Anything that depends only on the training vectors is precalculated here.

We also try to find columns where all training vectors are zero. Those columns are removed from the derivation process and every component will have zero value on that column. Note that at least one column should have non-zero training vectors.
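The zero-column detection described above can be sketched with plain lists standing in for the library's unboxed vectors: a column is a candidate for removal when every training vector is zero there.

```haskell
import Data.List (transpose)

-- Indices of columns where all training vectors (rows) are zero.
-- prepareTraining is documented to drop such columns from the derivation.
zeroColumns :: [[Double]] -> [Int]
zeroColumns rows =
  [ j | (j, col) <- zip [0 ..] (transpose rows), all (== 0) col ]
```

For example, `zeroColumns [[0, 1, 0], [0, 2, 0]]` finds that columns 0 and 2 are zero in every training vector.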

Functions (re-exported)

Derive a Dirichlet mixture using the maximum-likelihood method described by Karplus et al. (equation 25), with the CG_DESCENT method of Hager and Zhang (see Numeric.Optimization.Algorithms.HagerZhang05). All training vectors should have the same length; however, this is not verified.

Cost function for deriving a Dirichlet mixture (equation 18). This function is minimized by `derive`, and is calculated using equations (17) and (54).
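The interplay of `Delta`, `Reason`, and the cost function can be sketched as a generic convergence loop: apply a step repeatedly, stopping with "Delta" when the change in cost drops below the threshold, or with "MaxIter" when the iteration budget runs out. This mirrors the documented vocabulary, not the library's actual implementation (which drives the minimization through CG_DESCENT):

```haskell
-- Iterate `step` from a starting value until the cost change falls below
-- `delta` (reason "Delta") or `maxIter` steps have been taken ("MaxIter").
converge :: Int -> Double -> (a -> a) -> (a -> Double) -> a -> (String, Int, a)
converge maxIter delta step cost = go 0
  where
    go n x
      | n >= maxIter                   = ("MaxIter", n, x)
      | abs (cost x' - cost x) < delta = ("Delta", n + 1, x')
      | otherwise                      = go (n + 1) x'
      where x' = step x
```

For example, `converge 100 1e-6 (/ 2) id 1.0` halves its state until successive costs differ by less than 1e-6 and reports "Delta" as the reason.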