Learning-0.1.0: The most frequently used machine learning tools

Learning

Description

# Machine learning utilities

A micro library containing the most common machine learning tools. See also the mltool package: https://hackage.haskell.org/package/mltool.

Synopsis

# Datasets

data Dataset a b Source #

A dataset representation for supervised learning

Constructors

 Dataset
   _samples :: [a]
   _labels :: [b]
   toList :: [(a, b)]

fromList :: [(a, b)] -> Dataset a b Source #

Create a Dataset from list of samples (first) and labels (second)
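To illustrate the idea, here is a minimal sketch of a `Dataset`-like type with a `fromList` that simply unzips sample/label pairs. This is a hypothetical reimplementation (primed names) for illustration, not the library's code.

```haskell
-- Hypothetical sketch of the Dataset idea: a pair of parallel lists,
-- built from a list of (sample, label) pairs by unzipping.
data Dataset' a b = Dataset' { samples' :: [a], labels' :: [b] }
  deriving Show

fromList' :: [(a, b)] -> Dataset' a b
fromList' xs = Dataset' (map fst xs) (map snd xs)

main :: IO ()
main = do
  let ds = fromList' [(0.5 :: Double, "cat"), (0.7, "dog")]
  print (samples' ds)
  print (labels' ds)
```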

# Principal components analysis

data PCA Source #

Principal components analysis tools

Constructors

 PCA
   _u :: Matrix Double                            -- ^ Compression matrix U
   _compress :: Vector Double -> Matrix Double    -- ^ Compression function
   _decompress :: Matrix Double -> Vector Double  -- ^ Inverse to the compression function

Arguments

  :: Int              -- ^ Number of principal components to preserve
  -> [Vector Double]  -- ^ Observations
  -> PCA

Principal components analysis resulting in PCA tools

Arguments

  :: [Vector Double]                 -- ^ Data samples
  -> (Matrix Double, Vector Double)

Compute the covariance matrix sigma and return its eigenvectors u' and eigenvalues s
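The covariance step can be sketched on plain lists (the library itself works on hmatrix types; this is a hypothetical illustration): center each coordinate on its mean, then average the products of centered coordinates.

```haskell
-- Hypothetical sketch of the covariance matrix sigma over a list of
-- observations (each observation a list of coordinates):
-- sigma[i][j] = mean over samples of (x_i - mu_i) * (x_j - mu_j).
covarianceSketch :: [[Double]] -> [[Double]]
covarianceSketch xs =
  [ [ sum [ (row !! i - mu i) * (row !! j - mu j) | row <- xs ] / n
    | j <- idx ]
  | i <- idx ]
  where
    n    = fromIntegral (length xs)
    idx  = [0 .. length (head xs) - 1]
    mu k = sum (map (!! k) xs) / n

main :: IO ()
main = print (covarianceSketch [[1, 0], [-1, 0]])
```

The eigendecomposition of this matrix then yields the eigenvectors u' and eigenvalues s mentioned above.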

Arguments

  :: Double           -- ^ Retained variance, %
  -> [Vector Double]  -- ^ Observations
  -> PCA

Perform PCA using the minimal number of principal components required to retain given variance

# Supervised learning

Teacher matrix

0 0 0 0 0
0 0 0 0 0
1 1 1 1 1 <- Desired class index is 2
0 0 0 0 0 <- Number of classes is 4
^
5 repetitions

Arguments

  :: Int      -- ^ Number of classes (labels)
  -> Int      -- ^ Desired class index (starting from zero)
  -> Int      -- ^ Number of repeated columns in the teacher matrix
  -> Teacher

Create a binary Teacher matrix whose row of ones corresponds to the desired class index
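The diagram above can be reproduced with a short sketch (a hypothetical reimplementation on lists of rows, not the library's Teacher type): zeros everywhere, with a row of ones at the desired class index.

```haskell
-- Hypothetical teacher-matrix sketch: nClasses rows of reps columns,
-- all zero except the row at classIdx, which is all ones.
teacherSketch :: Int -> Int -> Int -> [[Double]]
teacherSketch nClasses classIdx reps =
  [ replicate reps (if r == classIdx then 1 else 0)
  | r <- [0 .. nClasses - 1] ]

main :: IO ()
main = mapM_ print (teacherSketch 4 2 5)  -- 4 classes, class 2, 5 repetitions
```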

newtype Classifier a Source #

A classifier function that maps a measurement matrix (measurements as columns, corresponding features as rows) into a categorical output.

Constructors

 Classifier
   classify :: Matrix Double -> a

newtype Regressor Source #

A regressor function that maps a feature matrix into a continuous multidimensional output. The feature matrix is expected to have columns corresponding to measurements (data points) and rows to features.

Constructors

 Regressor
   predict :: Matrix Double -> Matrix Double

Arguments

  :: (Storable a, Eq a)
  => Vector a       -- ^ List of all possible outcomes (classes)
  -> Matrix Double  -- ^ Network state (nonlinear response): each column corresponds to a measurement (data point) and each row to a feature
  -> Matrix Double  -- ^ Horizontally concatenated Teacher matrices, one row per desired class
  -> Either String (Classifier a)

Perform supervised learning (ridge regression) and create a linear Classifier function. The regression is run with regularization parameter μ = 1e-4.

Arguments

  :: Matrix Double  -- ^ Feature matrix with data points (measurements) as columns and features as rows
  -> Matrix Double  -- ^ Desired-outputs matrix corresponding to the data-point columns. For scalar (one-dimensional) prediction output, this should be a single-row matrix.
  -> Either String Regressor

Perform supervised learning (ridge regression) and create a linear Regressor function.
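The ridge-regression idea behind these learners can be sketched in one dimension with plain lists (a hypothetical illustration, not the library's matrix implementation): the regularization term mu is added to the denominator of the least-squares solution.

```haskell
-- Hypothetical one-dimensional ridge regression sketch:
-- the weight w minimizing sum (w*x - y)^2 + mu*w^2 is
-- w = (x . y) / (x . x + mu).
ridge1D :: Double -> [Double] -> [Double] -> Double
ridge1D mu xs ys =
  sum (zipWith (*) xs ys) / (sum (map (^ 2) xs) + mu)

main :: IO ()
main = print (ridge1D 1.0e-4 [1, 2, 3] [2, 4, 6])  -- close to the true slope 2
```

With mu = 0 this reduces to ordinary least squares; a small mu (such as the 1e-4 mentioned above) slightly shrinks the weight but stabilizes the solution.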

Arguments

  :: Matrix Double  -- ^ Measurements (feature matrix)
  -> Matrix Double  -- ^ Horizontally concatenated Teacher matrices
  -> Maybe Readout

Create a linear Readout using ridge regression. Similar to learnRegressor, but instead of a Regressor function, an (already transposed) Readout matrix may be returned.

Arguments

  :: Readout        -- ^ Readout matrix
  -> Matrix Double  -- ^ Network state
  -> Vector Double

Evaluate the network state (nonlinear response) according to some Readout matrix. Used by classification strategies such as winnerTakesAll.

Arguments

  :: (Storable a, Eq a)
  => Readout        -- ^ Readout matrix
  -> Vector a       -- ^ Vector of possible classes (labels)
  -> Matrix Double  -- ^ Input matrix
  -> a              -- ^ Label

Winner-takes-all classification method
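The winner-takes-all strategy can be sketched over plain lists (a hypothetical version; the library operates on Readout and hmatrix matrices): pair each label with its score and return the label with the largest score.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Hypothetical winner-takes-all sketch: the label whose score is maximal wins.
winnerTakesAllSketch :: Ord s => [lab] -> [s] -> lab
winnerTakesAllSketch labels ss =
  fst (maximumBy (comparing snd) (zip labels ss))

main :: IO ()
main = print (winnerTakesAllSketch "abc" [0.1, 0.9, 0.3 :: Double])
```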

# Evaluation

## Classification

accuracy :: (Eq lab, Fractional acc) => [lab] -> [lab] -> acc Source #

Accuracy of classification, 100% - errorRate

>>> accuracy [1,2,3,4] [1,2,3,7]
75.0


errorRate :: (Eq a, Fractional err) => [a] -> [a] -> err Source #

Error rate in %, an error measure for classification tasks

>>> errorRate [1,2,3,4] [1,2,3,7]
25.0

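Both measures can be sketched over plain label lists (hypothetical reimplementations with primed names, assuming equal-length inputs and accuracy = 100% - errorRate as documented above):

```haskell
-- Hypothetical sketch: percentage of positions where target and
-- prediction disagree, and its complement.
errorRate' :: (Eq a, Fractional err) => [a] -> [a] -> err
errorRate' tgt prd =
  100 * fromIntegral (length (filter id (zipWith (/=) tgt prd)))
      / fromIntegral (length tgt)

accuracy' :: (Eq a, Fractional acc) => [a] -> [a] -> acc
accuracy' tgt prd = 100 - errorRate' tgt prd

main :: IO ()
main = do
  print (accuracy'  [1, 2, 3, 4 :: Int] [1, 2, 3, 7] :: Double)  -- 75.0
  print (errorRate' [1, 2, 3, 4 :: Int] [1, 2, 3, 7] :: Double)  -- 25.0
```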

errors :: Eq lab => [(lab, lab)] -> [(lab, lab)] Source #

Pairs of misclassified and correct values

>>> errors $ zip ['x','y','z'] ['x','b','a']
[('y','b'),('z','a')]


Arguments

  :: (Ord lab, Eq lab, Show lab)
  => [lab]   -- ^ Target labels
  -> [lab]   -- ^ Predicted labels
  -> String  -- ^ Confusion matrix normalized by row, as an ASCII table. Note: it is assumed that the target (true) labels list contains all possible labels.

        | Predicted
     ---+------------
        | _ _ _ _ _
  True  | _ _ _ _ _
        | _ _ _ _ _
  label | _ _ _ _ _
        | _ _ _ _ _

>>> putStr $ showConfusion [1, 2, 3, 1] [1, 2, 3, 2]
      1     2     3
1   50.0  50.0   0.0
2    0.0 100.0   0.0
3    0.0   0.0 100.0


Arguments

  :: (Ord lab, Eq lab)
  => Normalize              -- ^ Normalize ByRow or ByColumn
  -> [lab]                  -- ^ Target labels
  -> [lab]                  -- ^ Predicted labels
  -> Map (lab, lab) Double  -- ^ Map keys: (target, predicted); values: normalized confusion

Normalized confusion matrix for arbitrary number of classes

data Normalize Source #

Normalization strategies for confusion matrix

Constructors

 ByRow
 ByColumn

Instances

 Show Normalize

Arguments

  :: (Ord lab, Eq lab)
  => [lab]               -- ^ Target labels
  -> [lab]               -- ^ Predicted labels
  -> Map (lab, lab) Int  -- ^ Map keys: (target, predicted); values: confusion counts

Confusion matrix for arbitrary number of classes (not normalized)
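The counting step can be sketched with Data.Map (a hypothetical reimplementation): tally each (target, predicted) pair.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical confusion-count sketch: one Map entry per observed
-- (target, predicted) pair, summing duplicates.
confusionSketch :: Ord lab => [lab] -> [lab] -> Map.Map (lab, lab) Int
confusionSketch tgt prd =
  Map.fromListWith (+) [ ((t, p), 1) | (t, p) <- zip tgt prd ]

main :: IO ()
main = print (Map.toList (confusionSketch [1, 2, 3, 1 :: Int] [1, 2, 3, 2]))
```

Normalizing each row (or column) of these counts to sum to 100% yields the normalized confusion matrix above.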

## Regression

Arguments

  :: (Storable a, Floating a)
  => Vector a  -- ^ Target signal
  -> Vector a  -- ^ Predicted signal
  -> a         -- ^ NRMSE

Normalized root mean square error (NRMSE), one of the most common error measures for regression tasks
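One common definition of NRMSE divides the root mean square error by the standard deviation of the target signal; the sketch below uses plain lists under that assumption (the library's exact normalization may differ in detail).

```haskell
-- Hypothetical NRMSE sketch: sqrt of (mean squared error / target variance).
nrmseSketch :: Floating a => [a] -> [a] -> a
nrmseSketch tgt prd = sqrt (mse / var)
  where
    n    = fromIntegral (length tgt)
    mse  = sum (zipWith (\t p -> (t - p) ^ 2) tgt prd) / n
    mean = sum tgt / n
    var  = sum (map (\t -> (t - mean) ^ 2) tgt) / n

main :: IO ()
main = print (nrmseSketch [0, 2] [1, 1] :: Double)
```

A value of 0 means a perfect prediction; a value of 1 means the prediction is no better than constantly predicting the target's mean.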