| Portability | portable |
|---|---|
| Stability | experimental |
| Maintainer | amy@nualeargais.ie |
| Safe Haskell | Safe-Inferred |
Data.Datamining.Clustering.SOM
Contents
Description
A Kohonen Self-organising Map (SOM). A SOM maps input patterns onto a regular grid (usually two-dimensional) where each node in the grid is a model of the input data, and does so using a method which ensures that any topological relationships within the input data are also represented in the grid. This implementation supports the use of non-numeric patterns.
In layman's terms, a SOM can be useful when you want to discover the underlying structure of some data. A tutorial is available at https://github.com/mhwombat/som/wiki
References:
- Kohonen, T. (1982). Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43 (1), 59–69.
- class Pattern p v | p -> v where
- difference :: p -> p -> v
- makeSimilar :: p -> v -> p -> p
- train :: (Ord v, Pattern p v, Grid g s k) => (Int -> v) -> GridMap g k p -> p -> GridMap g k p
- trainBatch :: (Ord v, Grid g s k, Pattern p v) => (Int -> v) -> GridMap g k p -> [p] -> GridMap g k p
- classify :: (Ord v, Pattern p v) => GridMap g k p -> p -> k
- classifyAndTrain :: (Eq k, Ord v, Pattern p v, Grid g s k) => (Int -> v) -> GridMap g k p -> p -> (k, GridMap g k p)
- diffs :: Pattern p v => GridMap g k p -> p -> GridMap g k v
- differences :: Pattern p v => p -> GridMap g k p -> GridMap g k v
- diffAndTrain :: (Eq k, Ord v, Pattern p v, Grid g s k) => (Int -> v) -> GridMap g k p -> p -> (GridMap g k v, GridMap g k p)
- normalise :: Floating a => [a] -> NormalisedVector a
- data NormalisedVector a
- scale :: Fractional a => [(a, a)] -> [a] -> ScaledVector a
- data ScaledVector a
- adjustVector :: (Num a, Ord a, Eq a) => [a] -> a -> [a] -> [a]
- euclideanDistanceSquared :: Num a => [a] -> [a] -> a
- gaussian :: Double -> Double -> Int -> Double
Documentation
class Pattern p v | p -> v where
A pattern to be learned or classified by a self-organising map.
Methods
difference :: p -> p -> v
Compares two patterns and returns a non-negative number
representing how different the patterns are. A result of 0
indicates that the patterns are identical.
makeSimilar :: p -> v -> p -> p
makeSimilar target amount pattern returns a modified copy of
pattern that is more similar to target than pattern is. The
magnitude of the adjustment is controlled by the amount
parameter, which should be a number between 0 and 1. Larger
values for amount permit greater adjustments. If amount=1,
the result should be identical to the target. If amount=0,
the result should be the unmodified pattern.
Instances
| (Fractional a, Ord a, Eq a) => Pattern (ScaledVector a) a | |
| (Floating a, Fractional a, Ord a, Eq a) => Pattern (NormalisedVector a) a |
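To add support for a new pattern type, implement both methods. Below is a minimal, self-contained sketch: the Pattern class is reproduced from above so the example compiles on its own, and the Brightness type is a hypothetical one-dimensional pattern invented for illustration.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- The Pattern class as defined in this module, reproduced here
-- so the example is self-contained.
class Pattern p v | p -> v where
  difference  :: p -> p -> v
  makeSimilar :: p -> v -> p -> p

-- A hypothetical one-dimensional pattern: a brightness level.
newtype Brightness = Brightness Double deriving (Show, Eq)

instance Pattern Brightness Double where
  -- Absolute difference: non-negative, and 0 for identical patterns.
  difference (Brightness a) (Brightness b) = abs (a - b)
  -- Linear interpolation: amount=0 leaves the pattern unchanged,
  -- amount=1 moves it all the way to the target.
  makeSimilar (Brightness t) amount (Brightness x) =
    Brightness (x + amount * (t - x))
```

Any implementation along these lines satisfies the contract: difference is zero exactly for identical patterns, and makeSimilar interpolates between the original and the target.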
Using the SOM
train :: (Ord v, Pattern p v, Grid g s k) => (Int -> v) -> GridMap g k p -> p -> GridMap g k p
If f d is a function that returns the learning rate to apply to a
node based on its distance d from the node that best matches the
input pattern, then train f c pattern returns a modified copy
of the classifier c that has partially learned the pattern.
trainBatch :: (Ord v, Grid g s k, Pattern p v) => (Int -> v) -> GridMap g k p -> [p] -> GridMap g k p
Same as train, but applied to multiple patterns.
classify :: (Ord v, Pattern p v) => GridMap g k p -> p -> k
classify c pattern returns the position of the node in c
whose pattern best matches the input pattern.
classifyAndTrain :: (Eq k, Ord v, Pattern p v, Grid g s k) => (Int -> v) -> GridMap g k p -> p -> (k, GridMap g k p)
If f is a function that returns the learning rate to apply to a
node based on its distance from the node that best matches the
target, then classifyAndTrain f c target returns a tuple
containing the position of the node in c whose pattern best
matches the input target, and a modified copy of the classifier
c that has partially learned the target.
Invoking classifyAndTrain f c p may be faster than invoking
(classify c p, train f c p), but they should give identical
results.
diffs :: Pattern p v => GridMap g k p -> p -> GridMap g k v
diffs c pattern returns the positions of all nodes in c, paired
with the difference between pattern and the node's pattern.
differences :: Pattern p v => p -> GridMap g k p -> GridMap g k v
diffAndTrain :: (Eq k, Ord v, Pattern p v, Grid g s k) => (Int -> v) -> GridMap g k p -> p -> (GridMap g k v, GridMap g k p)
If f is a function that returns the learning rate to apply to a
node based on its distance from the node that best matches the
target, then diffAndTrain f c target returns a tuple containing:
1. The positions of all nodes in c, paired with the difference
between target and the node's pattern.
2. A modified copy of the classifier c that has partially
learned the target.
Invoking diffAndTrain f c p may be faster than invoking
(differences p c, train f c p), but they should give identical
results.
Numeric vectors as patterns
Normalised vectors
normalise :: Floating a => [a] -> NormalisedVector a
Normalises a vector.
data NormalisedVector a
A vector that has been normalised, i.e., the magnitude of the vector = 1.
Instances
| Show a => Show (NormalisedVector a) | |
| (Floating a, Fractional a, Ord a, Eq a) => Pattern (NormalisedVector a) a |
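The underlying computation can be sketched as follows. This is an illustrative plain-list version, assuming the Euclidean norm is the magnitude referred to above; the library's normalise wraps its result in a NormalisedVector rather than returning a plain list.

```haskell
-- A sketch of vector normalisation, assuming the Euclidean norm:
-- divide every element by the vector's magnitude, so the result
-- has magnitude 1.
normaliseList :: Floating a => [a] -> [a]
normaliseList xs = map (/ magnitude) xs
  where magnitude = sqrt (sum (map (^ 2) xs))
```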
Scaled vectors
scale :: Fractional a => [(a, a)] -> [a] -> ScaledVector a
Given a vector qs of pairs of numbers, where each pair represents
the maximum and minimum value expected at that position in xs,
scale qs xs scales the vector xs element by element, mapping the
maximum value expected at that position to one, and the minimum
value to zero.
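The scaling described above can be sketched on plain lists as follows. This illustrative version assumes each pair in qs is (maximum, minimum), matching the order in the description; the library's scale wraps its result in a ScaledVector.

```haskell
-- A sketch of element-by-element scaling. Each pair in qs is
-- assumed to be (maximum, minimum) for that position; the maximum
-- maps to 1 and the minimum to 0.
scaleList :: Fractional a => [(a, a)] -> [a] -> [a]
scaleList qs xs = zipWith f qs xs
  where f (hi, lo) x = (x - lo) / (hi - lo)
```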
data ScaledVector a
A vector that has been scaled so that all elements in the vector
are between zero and one. To scale a set of vectors, use
scaleAll. Alternatively, if you can identify a maximum and
minimum value for each element in a vector, you can scale
individual vectors using scale.
Instances
| Show a => Show (ScaledVector a) | |
| (Fractional a, Ord a, Eq a) => Pattern (ScaledVector a) a |
Useful functions
If you wish to use a SOM with raw numeric vectors, compile with
-fno-warn-orphans and add the following to your code:

instance (Floating a, Fractional a, Ord a, Eq a) => Pattern [a] a where
  difference = euclideanDistanceSquared
  makeSimilar = adjustVector
adjustVector :: (Num a, Ord a, Eq a) => [a] -> a -> [a] -> [a]
adjustVector target amount vector adjusts vector to move it
closer to target. The magnitude of the adjustment is controlled
by amount, which should be a number between 0 and 1. Larger
values of amount permit more adjustment. If amount=1, the result
will be identical to the target. If amount=0, the result will be
the unmodified vector.
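An implementation consistent with this description can be sketched as simple element-wise linear interpolation towards the target (a sketch, not necessarily the library's exact definition):

```haskell
-- A sketch of adjustVector: each element is moved linearly
-- towards the corresponding element of the target, by a fraction
-- given by amount.
adjustVector :: (Num a, Ord a, Eq a) => [a] -> a -> [a] -> [a]
adjustVector ts amount = zipWith adjust ts
  where adjust t x = x + amount * (t - x)
```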
euclideanDistanceSquared :: Num a => [a] -> [a] -> a
Calculates the square of the Euclidean distance between two vectors.
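This can be sketched as the sum of squared element-wise differences. Skipping the final square root keeps the result in the Num class and is cheaper, while preserving the ordering needed to find the best matching node.

```haskell
-- A sketch of the squared Euclidean distance between two vectors:
-- the sum of the squares of the element-wise differences.
euclideanDistanceSquared :: Num a => [a] -> [a] -> a
euclideanDistanceSquared xs ys = sum (map (^ 2) (zipWith (-) xs ys))
```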
gaussian :: Double -> Double -> Int -> Double
Calculates c*e^(-d^2/(2*w^2)).
This form of the Gaussian function is useful as a learning rate
function. In gaussian c w d, c specifies the highest learning
rate, which will be applied to the SOM node that best matches the
input pattern. The learning rate applied to other nodes decreases
with their distance d from the best matching node. The value w
controls the 'width' of the Gaussian: higher values of w cause
the learning rate to fall off more slowly with distance.
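A direct transcription of the formula above (a sketch consistent with the stated type and formula):

```haskell
-- A sketch of the Gaussian learning-rate function:
-- gaussian c w d = c * e^(-d^2 / (2*w^2)), where d is the grid
-- distance from the best matching node.
gaussian :: Double -> Double -> Int -> Double
gaussian c w d = c * exp (-(d' * d') / (2 * w * w))
  where d' = fromIntegral d
```

A typical use is to partially apply the first two arguments, e.g. train (gaussian 0.5 2) c pattern, so that the SOM receives a function of grid distance only; the example values 0.5 and 2 are arbitrary.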