EM for a mixture of k one-dimensional Gaussians. This procedure tends to produce NaNs whenever more Gaussians are requested than the data supports, which conveniently signals overfitting. ;-)
TODO cite paper
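For orientation, a single EM iteration for a one-dimensional Gaussian mixture might look like the sketch below. It is not this module's code: the names Component, normalPdf and emStep are hypothetical, and the representation of a component as (weight, mean, standard deviation) is an assumption, since the module's actual Theta type is not shown here. The sketch expects a non-empty component list and non-empty data.

```haskell
-- Hypothetical representation: (weight, mean, standard deviation).
type Component = (Double, Double, Double)

-- Density of a normal distribution with mean mu and standard deviation sigma.
normalPdf :: Double -> Double -> Double -> Double
normalPdf mu sigma x =
  exp (negate ((x - mu) ^ 2) / (2 * sigma ^ 2)) / (sigma * sqrt (2 * pi))

-- One EM iteration (assumes at least one component and one data point).
emStep :: [Double] -> [Component] -> [Component]
emStep xs theta = map update resp
  where
    n = fromIntegral (length xs)
    -- E-step: weighted density of every data point under every component ...
    dens   = [ [ w * normalPdf mu s x | x <- xs ] | (w, mu, s) <- theta ]
    -- ... normalised per data point to give responsibilities.
    totals = foldr1 (zipWith (+)) dens
    resp   = [ zipWith (/) d totals | d <- dens ]
    -- M-step: re-estimate weight, mean and standard deviation.
    update r = (nk / n, mu', sqrt var')
      where
        nk   = sum r
        mu'  = sum (zipWith (*) r xs) / nk
        var' = sum (zipWith (\ri x -> ri * (x - mu') ^ 2) r xs) / nk
```

In this sketch a component whose standard deviation reaches zero makes normalPdf evaluate 0/0, so every value computed from it becomes NaN. That is one plausible route to the NaNs described above, though the module itself may behave differently.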
Documentation
emFix :: Data -> Theta -> Theta
Find an optimal set of parameters Theta. The additional takeWhile (not . isnan . fst) makes sure that emFix does terminate in cases of overfitting. Because the check is made on fst while the result is taken from snd, the returned values will be NaNs whenever NaNs occur.
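The "checking fst, returning snd" behaviour can be illustrated with a minimal fixed-point sketch on plain Doubles. The function fixpoint, its eps tolerance and the convergence test are assumptions made for illustration and are not taken from emFix itself. The sketch uses the Prelude's isNaN.

```haskell
-- Iterate step from x0 until two successive values differ by less than eps,
-- or until a NaN shows up. Hypothetical helper, not the module's emFix.
fixpoint :: Double -> (Double -> Double) -> Double -> Double
fixpoint eps step x0
  | null kept = x0                 -- x0 itself was NaN
  | otherwise = snd (last kept)
  where
    xs    = iterate step x0
    pairs = zip xs (drop 1 xs)     -- (previous, next)
    -- Stop as soon as the *previous* value is NaN. The last surviving pair
    -- then has a valid fst and a NaN snd, so the NaN is what gets returned.
    valid = takeWhile (not . isNaN . fst) pairs
    -- Keep everything up to and including the first converged pair.
    (before, rest) = break converged valid
    kept  = before ++ take 1 rest
    converged (a, b) = abs (a - b) < eps
```

Without the takeWhile, a NaN would never satisfy the convergence test (every comparison with NaN is False), so the search would run forever. For example, fixpoint 1e-12 (\x -> (x + 2 / x) / 2) 1 approximates sqrt 2, while fixpoint 1e-9 (\x -> sqrt (x - 1)) 1.5 terminates with NaN.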
emStarts :: Int -> Data -> Theta
Given a set of Data and a number k of Gaussian peaks, try to find the optimal GMM. This is done by trying each data point as mu for each Gaussian. Note that this will be rather slow for larger k (larger than, say, 2 or 3). In that case, a random-drawing method should be chosen.
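To see why the exhaustive search becomes slow, the following sketch enumerates candidate starting means in two ways. Both functions are hypothetical and only illustrate the growth in the number of starts. Whether emStarts uses ordered tuples, unordered subsets, or something else is an assumption here.

```haskell
import Control.Monad (replicateM)

-- Every data point tried as mu for every Gaussian: (length xs) ^ k starts.
startMeans :: Int -> [Double] -> [[Double]]
startMeans k xs = replicateM k xs

-- Only unordered choices of k distinct data points: "n choose k" starts.
startMeansDistinct :: Int -> [Double] -> [[Double]]
startMeansDistinct 0 _      = [[]]
startMeansDistinct _ []     = []
startMeansDistinct k (x:xs) =
  map (x :) (startMeansDistinct (k - 1) xs) ++ startMeansDistinct k xs
```

With 100 data points, startMeans already yields 10000 starts for k = 2 and 1000000 for k = 3, each of which would need a full EM run.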
TODO xs' -> xs sorting makes me cry!