Portability | portable |
---|---|
Stability | experimental |
Maintainer | amy@nualeargais.ie |
Safe Haskell | Safe-Inferred |

A Kohonen Self-organising Map (SOM). A SOM maps input patterns onto a regular grid (usually two-dimensional) where each node in the grid is a model of the input data, and does so using a method which ensures that any topological relationships within the input data are also represented in the grid. This implementation supports the use of non-numeric patterns.

In layman's terms, a SOM can be useful when you want to discover the underlying structure of some data. A tutorial is available at https://github.com/mhwombat/som/wiki.

NOTES:

- Version 5.0 fixed a bug in the `decayingGaussian` function. If you use `defaultSOM` (which uses this function), your SOM should now learn more quickly.

- The `gaussian` function has been removed because it is not as useful for SOMs as I originally thought. It was originally designed to be used as a factor in a learning function. However, in most cases the user will want to introduce a time decay into the exponent, rather than simply multiply by a factor.

References:

- Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43 (1), 59–69.

- data SOM gm k p
- defaultSOM :: Floating (Metric p) => gm p -> Metric p -> Metric p -> Int -> SOM gm k p
- customSOM :: gm p -> (Int -> Int -> Metric p) -> SOM gm k p
- decayingGaussian :: Floating a => a -> a -> Int -> Int -> Int -> a
- toGridMap :: GridMap gm p => SOM gm k p -> gm p
- trainNeighbourhood :: (Pattern p, Grid (gm p), GridMap gm p, Index (BaseGrid gm p) ~ Index (gm p)) => SOM gm k p -> Index (gm p) -> p -> SOM gm k p
- incrementCounter :: SOM gm k p -> SOM gm k p

# Construction

A Self-Organising Map (SOM).

Although `SOM` implements `GridMap`, most users will only need the interface provided by `Data.Datamining.Clustering.Classifier`. If you choose to use the `GridMap` functions, please note:

- The functions `adjust` and `adjustWithKey` do not increment the counter. You can do so manually with `incrementCounter`.
- The functions `map` and `mapWithKey` are not implemented (they just return an `error`). It would be problematic to implement them because the input SOM and the output SOM would have to have the same `Metric` type.

(GridMap gm p, k ~ Index (BaseGrid gm p), Pattern p, Grid (gm p), GridMap gm (Metric p), k ~ Index (gm p), k ~ Index (BaseGrid gm (Metric p)), Ord (Metric p)) => Classifier (SOM gm) k p

Foldable gm => Foldable (SOM gm k)

(Foldable gm, GridMap gm p, Grid (BaseGrid gm p)) => GridMap (SOM gm k) p

Grid (gm p) => Grid (SOM gm k p)

defaultSOM :: Floating (Metric p) => gm p -> Metric p -> Metric p -> Int -> SOM gm k p

Creates a classifier with a default (bell-shaped) learning function. Usage is `defaultSOM gm r w t`, where:

`gm`

- The geometry and initial models for this classifier. A reasonable choice here is `lazyGridMap g ps`, where `g` is a `HexHexGrid` and `ps` is a set of random patterns.

`r`

- The learning rate to be applied to the BMU (Best Matching Unit) at time zero. The BMU is the model which best matches the current target pattern.

`w`

- The width of the bell curve at time zero.

`t`

- Controls how rapidly the learning rate decays. After this time, any learning done by the classifier will be negligible. We recommend setting this parameter to the number of patterns (or pattern batches) that will be presented to the classifier. An estimate is fine.

customSOM :: gm p -> (Int -> Int -> Metric p) -> SOM gm k p

Creates a classifier with a custom learning function. Usage is `customSOM gm f`, where:

`gm`

- The geometry and initial models for this classifier. A reasonable choice here is `lazyGridMap g ps`, where `g` is a `HexHexGrid` and `ps` is a set of random patterns.

`f`

- A function used to adjust the models in the classifier. This function will be invoked with two parameters. The first parameter indicates how many patterns (or pattern batches) have previously been presented to this classifier. Typically this is used to make the learning rate decay over time. The second parameter is the grid distance from the node being updated to the BMU (Best Matching Unit). The output is the learning rate for that node (the amount by which the node's model should be updated to match the target). The learning rate should be between zero and one.
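As a rough illustration of the kind of function `customSOM` expects, the sketch below builds a learning function with a hard-edged neighbourhood and a rate that decays linearly over time. The name `stepLearning` and its parameters are hypothetical (they are not part of this module), and `Double` is assumed as the metric type.

```haskell
-- Hypothetical helper, not part of this module.
-- Builds a learning function of the shape (Int -> Int -> Metric p),
-- assuming the metric type is Double.
stepLearning
  :: Double  -- learning rate for the BMU at time zero (between 0 and 1)
  -> Int     -- neighbourhood radius, in grid steps
  -> Int     -- time after which no more learning happens
  -> Int     -- number of patterns presented so far (non-negative)
  -> Int     -- grid distance from the node to the BMU (non-negative)
  -> Double
stepLearning r0 radius tMax t d
  | tMax <= 0 || t >= tMax = 0   -- training period is over
  | d > radius             = 0   -- node lies outside the neighbourhood
  | otherwise              = r0 * remaining / fromIntegral (d + 1)
  where
    -- fraction of the training period still to come, in (0, 1]
    remaining = 1 - fromIntegral t / fromIntegral tMax
```

Under these assumptions, partially applying the first three arguments leaves a function of time and grid distance, so a call might look like `customSOM gm (stepLearning 0.5 2 1000)`. The result stays between zero and one whenever `r0` does.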

decayingGaussian :: Floating a => a -> a -> Int -> Int -> Int -> a

Configures a typical learning function for classifiers. `decayingGaussian r w0 tMax` returns a bell curve-shaped function. At time zero, the maximum learning rate (applied to the BMU) is `r`, and the neighbourhood width is `w0`. Over time the bell curve shrinks and the learning rate tapers off, until at time `tMax` the learning rate is negligible.
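To make the shape of such a function concrete, here is a minimal sketch of a bell curve whose height and width both shrink over time. The name `sketchDecayingGaussian`, the decay constant 5, and the exact formula are assumptions chosen for illustration; this is not the formula used by `decayingGaussian`. The last two arguments are taken to be the current time and the grid distance to the BMU, and `w0` and `tMax` are assumed to be positive.

```haskell
-- Hypothetical sketch of a decaying bell-curve learning function.
-- The formula is an assumption, not this module's code.
sketchDecayingGaussian :: Floating a => a -> a -> Int -> Int -> Int -> a
sketchDecayingGaussian r w0 tMax t d =
  rate * exp (negate (d' * d') / (2 * w * w))
  where
    -- shared decay factor: 1 at time zero, exp (-5) (under 1%) at time tMax
    decay = exp (negate 5 * fromIntegral t / fromIntegral tMax)
    rate  = r * decay    -- learning rate applied to the BMU at time t
    w     = w0 * decay   -- neighbourhood width at time t
    d'    = fromIntegral d
```

In this sketch the BMU (distance zero) receives exactly `r` at time zero, nodes further away receive less, and by time `tMax` even the BMU receives less than one percent of `r`.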

# Deconstruction

toGridMap :: GridMap gm p => SOM gm k p -> gm p

Extracts the grid and current models from the SOM.

# Advanced control

trainNeighbourhood :: (Pattern p, Grid (gm p), GridMap gm p, Index (BaseGrid gm p) ~ Index (gm p)) => SOM gm k p -> Index (gm p) -> p -> SOM gm k p

Trains the specified node and the neighbourhood around it to better match a target. Most users should use `train`, which automatically determines the BMU and trains it and its neighbourhood.
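The difference between training a chosen node and letting the classifier find the BMU can be pictured with a toy one-dimensional map, where each node holds a single `Double`. Everything in this sketch (`toyBmu`, `toyTrainNeighbourhood`, `toyTrain`, and the list representation) is hypothetical and independent of this module's types.

```haskell
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Toy stand-in for a SOM: a row of nodes, each holding one Double model.
-- All names here are hypothetical, not part of this module.

-- Index of the best matching unit: the node whose model is closest
-- to the target. Assumes the list of models is non-empty.
toyBmu :: [Double] -> Double -> Int
toyBmu models target =
  fst (minimumBy (comparing (\(_, m) -> abs (m - target))) (zip [0 ..] models))

-- Move the node at index c and its neighbours towards the target.
-- The learning function receives each node's distance (in nodes) from c.
toyTrainNeighbourhood :: (Int -> Double) -> [Double] -> Int -> Double -> [Double]
toyTrainNeighbourhood rateAt models c target =
  [ m + rateAt (abs (i - c)) * (target - m) | (i, m) <- zip [0 ..] models ]

-- Find the BMU first, then train it and its neighbourhood.
toyTrain :: (Int -> Double) -> [Double] -> Double -> [Double]
toyTrain rateAt models target =
  toyTrainNeighbourhood rateAt models (toyBmu models target) target
```

Here `toyTrain` plays the role of `train`, while `toyTrainNeighbourhood` lets the caller pick the centre node directly, which is the extra control this section describes.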

incrementCounter :: SOM gm k p -> SOM gm k p