Learning-0.1.0: The most frequently used machine learning tools

Safe Haskell: None
Language: Haskell2010

Learning


Description

Machine learning utilities

A micro library containing the most common machine learning tools. See also the mltool package: https://hackage.haskell.org/package/mltool.


Datasets

data Dataset a b Source #

A dataset representation for supervised learning

Constructors

Dataset 

Fields

fromList :: [(a, b)] -> Dataset a b Source #

Create a Dataset from a list of pairs: samples (first component) and labels (second component)
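
A minimal usage sketch with hypothetical data (hmatrix vectors as samples, though any sample and label types fit the signature):

  import Learning (Dataset, fromList)
  import qualified Numeric.LinearAlgebra as LA

  -- Two labeled samples: feature vectors paired with integer class labels
  ds :: Dataset (LA.Vector Double) Int
  ds = fromList [ (LA.vector [0.1, 0.2], 0)
                , (LA.vector [0.9, 0.8], 1) ]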

Principal components analysis

data PCA Source #

Principal components analysis tools

Constructors

PCA 

Fields

pca Source #

Arguments

:: Int

Number of principal components to preserve

-> [Vector Double]

Observations

-> PCA 

Perform principal components analysis, returning the PCA tools

pca' Source #

Arguments

:: [Vector Double]

Data samples

-> (Matrix Double, Vector Double) 

Compute the covariance matrix sigma and return its eigenvectors u' and eigenvalues s

pcaVariance Source #

Arguments

:: Double

Retained variance, %

-> [Vector Double]

Observations

-> PCA 

Perform PCA using the minimal number of principal components required to retain given variance
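
A usage sketch for both constructors, with hypothetical observations; the retained variance is given in percent, per the argument description above:

  import Learning (PCA, pca, pcaVariance)
  import qualified Numeric.LinearAlgebra as LA

  observations :: [LA.Vector Double]
  observations = map LA.vector [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2]]

  -- Keep exactly one principal component
  p1 :: PCA
  p1 = pca 1 observations

  -- Keep as many components as needed to retain 95% of the variance
  p2 :: PCA
  p2 = pcaVariance 95 observations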

Supervised learning

type Teacher = Matrix Double Source #

Teacher matrix

0 0 0 0 0
0 0 0 0 0
1 1 1 1 1 <- Desired class index is 2
0 0 0 0 0 <- Number of classes is 4
        ^
  5 repetitions

teacher Source #

Arguments

:: Int

Number of classes (labels)

-> Int

Desired class index (starting from zero)

-> Int

Number of repeated columns in teacher matrix

-> Teacher 

Create a binary Teacher matrix whose row of ones corresponds to the desired class index
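
For example, the matrix drawn above (four classes, desired class index 2, five repeated columns) would be produced by:

  import Learning (Teacher, teacher)

  t :: Teacher
  t = teacher 4 2 5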

newtype Classifier a Source #

Classifier function that maps a measurement matrix (measurements as columns, features as rows) into a categorical output.

Constructors

Classifier 

Fields

newtype Regressor Source #

Regressor function that maps a feature matrix into a continuous multidimensional output. The feature matrix is expected to have columns corresponding to measurements (data points) and rows corresponding to features.

Constructors

Regressor 

type Readout = Matrix Double Source #

Linear readout (matrix)

learnClassifier Source #

Arguments

:: (Storable a, Eq a) 
=> Vector a

List of all possible outcomes (classes)

-> Matrix Double

Network state (nonlinear response) where each matrix column corresponds to a measurement (data point) and each row corresponds to a feature

-> Matrix Double

Horizontally concatenated Teacher matrices where each row corresponds to a desired class

-> Either String (Classifier a) 

Perform supervised learning (ridge regression) and create a linear Classifier function. The regression is run with regularization parameter μ = 1e-4.
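
A toy sketch with hypothetical data (hmatrix for the matrices; the per-class Teacher blocks are concatenated horizontally with hmatrix's |||):

  import Learning
  import qualified Numeric.LinearAlgebra as LA

  -- Assumed state matrix: 3 features (rows), 4 measurements (columns)
  state :: LA.Matrix Double
  state = LA.fromLists [ [0.1, 0.2, 0.9, 0.8]
                       , [0.0, 0.1, 0.7, 0.9]
                       , [0.2, 0.1, 0.8, 0.7] ]

  -- First two measurements belong to class 0, the last two to class 1
  targets :: LA.Matrix Double
  targets = teacher 2 0 2 LA.||| teacher 2 1 2

  classes :: LA.Vector Double
  classes = LA.vector [0, 1]

  trained :: Either String (Classifier Double)
  trained = learnClassifier classes state targets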

learnRegressor Source #

Arguments

:: Matrix Double

Feature matrix with data points (measurements) as columns and features as rows

-> Matrix Double

Desired outputs matrix corresponding to data point columns. In case of scalar (one-dimensional) prediction output, it should be a single row matrix.

-> Either String Regressor 

Perform supervised learning (ridge regression) and create a linear Regressor function.
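
A sketch for scalar prediction (hence a single-row output matrix), with hypothetical data:

  import Learning (Regressor, learnRegressor)
  import qualified Numeric.LinearAlgebra as LA

  -- 2 features (a raw input plus a constant bias row), 5 measurements
  xs :: LA.Matrix Double
  xs = LA.fromLists [ [1, 2, 3, 4, 5]
                    , [1, 1, 1, 1, 1] ]

  -- Scalar targets: a single-row matrix, one column per measurement
  ys :: LA.Matrix Double
  ys = LA.fromLists [[2.1, 3.9, 6.2, 8.0, 9.8]]

  r :: Either String Regressor
  r = learnRegressor xs ys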

learn' Source #

Arguments

:: Matrix Double

Measurements (feature matrix)

-> Matrix Double

Horizontally concatenated Teacher matrices

-> Maybe Readout 

Create a linear Readout using ridge regression. Similar to learnRegressor, but returns an (already transposed) Readout matrix instead of a Regressor function.

scores Source #

Arguments

:: Readout

Readout matrix

-> Matrix Double

Network state

-> Vector Double 

Evaluate the network state (nonlinear response) according to some Readout matrix. Used by classification strategies such as winnerTakesAll.

winnerTakesAll Source #

Arguments

:: (Storable a, Eq a) 
=> Readout

Readout matrix

-> Vector a

Vector of possible classes (labels)

-> Matrix Double

Input matrix

-> a

Label

Winner-takes-all classification method
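
A usage sketch for two integer labels; the second definition is an assumption about how the strategy presumably works (the label at the index of the maximal score wins), not the package's actual code:

  import Learning
  import qualified Numeric.LinearAlgebra as LA

  -- Classify measurement columns into label 0 or 1
  classify01 :: Readout -> LA.Matrix Double -> Int
  classify01 ro = winnerTakesAll ro (LA.fromList [0, 1])

  -- Assumed equivalent formulation via scores and maxIndex
  classify01' :: Readout -> LA.Matrix Double -> Int
  classify01' ro input = [0, 1] !! LA.maxIndex (scores ro input)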

Evaluation

Classification

accuracy :: (Eq lab, Fractional acc) => [lab] -> [lab] -> acc Source #

Accuracy of classification, 100% - errorRate

>>> accuracy [1,2,3,4] [1,2,3,7]
75.0

errorRate :: (Eq a, Fractional err) => [a] -> [a] -> err Source #

Error rate in %, an error measure for classification tasks

>>> errorRate [1,2,3,4] [1,2,3,7]
25.0

errors :: Eq lab => [(lab, lab)] -> [(lab, lab)] Source #

Pairs of misclassified and correct values

>>> errors $ zip ['x','y','z'] ['x','b','a']
[('y','b'),('z','a')]

showConfusion Source #

Arguments

:: (Ord lab, Eq lab, Show lab) 
=> [lab]

Target labels

-> [lab]

Predicted labels

-> String 

Confusion matrix normalized by row: ASCII representation.

Note: it is assumed that target (true) labels list contains all possible labels.

          |  Predicted
       ---+------------
          | _ _ _ _ _
     True | _ _ _ _ _
          | _ _ _ _ _
    label | _ _ _ _ _
          | _ _ _ _ _
>>> putStr $ showConfusion [1, 2, 3, 1] [1, 2, 3, 2]
      1     2     3
1   50.0  50.0   0.0
2    0.0 100.0   0.0
3    0.0   0.0 100.0

confusion Source #

Arguments

:: (Ord lab, Eq lab) 
=> Normalize

Normalize ByRow or ByColumn

-> [lab]

Target labels

-> [lab]

Predicted labels

-> Map (lab, lab) Double

Map keys: (target, predicted), values: normalized confusion

Normalized confusion matrix for arbitrary number of classes

data Normalize Source #

Normalization strategies for confusion matrix

Constructors

ByRow 
ByColumn 

confusion' Source #

Arguments

:: (Ord lab, Eq lab) 
=> [lab]

Target labels

-> [lab]

Predicted labels

-> Map (lab, lab) Int

Map keys: (target, predicted), values: confusion count

Confusion matrix for arbitrary number of classes (not normalized)
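
A small sketch with hypothetical labels; whether zero-count label pairs also appear in the Map is not specified above, so the comment lists only the observed pairs:

  import Learning (confusion')
  import qualified Data.Map as M

  counts :: M.Map (Int, Int) Int
  counts = confusion' [1, 2, 1] [1, 2, 2]
  -- Expected entries: ((1,1),1) and ((2,2),1) for the hits,
  -- and ((1,2),1) for the single misclassification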

Regression

nrmse Source #

Arguments

:: (Storable a, Floating a) 
=> Vector a

Target signal

-> Vector a

Predicted signal

-> a

NRMSE

Normalized root mean square error (NRMSE), one of the most common error measures for regression tasks
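
NRMSE is most commonly defined as the RMSE of the prediction divided by the standard deviation of the target signal; below is a reference sketch under that assumption (the exact normalization used by the package is not stated above):

  import qualified Numeric.LinearAlgebra as LA

  -- Assumed reference definition: sqrt (MSE / Var target);
  -- the 1/n factors cancel between numerator and denominator
  nrmseRef :: LA.Vector Double -> LA.Vector Double -> Double
  nrmseRef target predicted =
    sqrt (LA.sumElements (err * err) / LA.sumElements (dev * dev))
    where
      err  = target - predicted
      mean = LA.sumElements target / fromIntegral (LA.size target)
      dev  = LA.cmap (subtract mean) target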