Copyright | (c) Keegan Carruthers-Smith, 2009 |
---|---|

License | BSD 3 Clause |

Maintainer | gershomb@gmail.com |

Stability | experimental |

Safe Haskell | Safe-Inferred |

Language | Haskell98 |

A simple implementation of the standard k-means clustering algorithm: http://en.wikipedia.org/wiki/K-means_clustering. K-means clustering partitions points into clusters, with each point belonging to the cluster with th nearest mean. As the general problem is NP hard, the standard algorithm, which is relatively rapid, is heuristic and not guaranteed to converge to a global optimum. Varying the input order, from which the initial clusters are generated, can yield different results. For degenerate and malicious cases, the algorithm may take exponential time.

# Documentation

kmeans :: Int -> [[Double]] -> [[[Double]]] Source

Cluster points in a Euclidian space, represented as lists of Doubles, into at most k clusters. The initial clusters are chosen arbitrarily.

kmeansGen :: (a -> [Double]) -> Int -> [a] -> [[a]] Source

A generalized kmeans function. This function operates not on points, but an arbitrary type which may be projected into a Euclidian space. Since the projection may be chosen freely, this allows for weighting dimensions to different degrees, etc.