# datasets: Classical data sets for statistics and machine learning

[ data, data-mining, library, machine-learning, mit, statistics ]

Classical machine learning and statistics datasets from the UCI Machine Learning Repository and other sources.

The datasets package defines two different kinds of datasets:

• small data sets which are embedded in the package as pure values, either directly or indirectly via file-embed, and so require no network access or IO to load. These include Iris, Anscombe and OldFaithful.

• other data sets, which must be fetched over the network with Numeric.Datasets.getDataset and are cached in a local temporary directory.

The datafiles/ directory of this package includes copies of a few famous datasets, such as Titanic, Nightingale and Michelson.

Example:

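import Numeric.Datasets (getDataset)
import Numeric.Datasets.Iris (iris)
import Numeric.Datasets.Abalone (abalone)

main :: IO ()
main = do
  -- The Iris data set is embedded, so it is available as a pure value
  print (length iris)
  -- The Abalone data set is fetched over the network (and cached locally)
  abas <- getDataset abalone
  print (head abas)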