Learning-0.1.0: The most frequently used machine learning tools

Learning

Description

# Machine learning utilities

A micro library containing the most common machine learning tools. See also the mltool package: https://hackage.haskell.org/package/mltool.

Synopsis

# Datasets

data Dataset a b Source #

A dataset representation for supervised learning

Constructors

 Dataset
   _samples :: [a]
   _labels :: [b]
   toList :: [(a, b)]

fromList :: [(a, b)] -> Dataset a b Source #

Create a Dataset from list of samples (first) and labels (second)
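To illustrate the idea, here is a minimal sketch of a `Dataset`-like type with a `fromList` that simply unzips sample/label pairs. This is a hypothetical reimplementation (primed names) for illustration, not the library's code.

```haskell
-- Hypothetical sketch of the Dataset idea: a pair of parallel lists,
-- built from a list of (sample, label) pairs by unzipping.
data Dataset' a b = Dataset' { samples' :: [a], labels' :: [b] }
  deriving Show

fromList' :: [(a, b)] -> Dataset' a b
fromList' xs = Dataset' (map fst xs) (map snd xs)

main :: IO ()
main = do
  let ds = fromList' [(0.5 :: Double, "cat"), (0.7, "dog")]
  print (samples' ds)
  print (labels' ds)
```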

# Principal components analysis

data PCA Source #

Principal components analysis tools

Constructors

 PCA
   _u :: Matrix Double                            -- ^ Compression matrix U
   _compress :: Vector Double -> Matrix Double    -- ^ Compression function
   _decompress :: Matrix Double -> Vector Double  -- ^ Inverse to the compression function

Arguments

  :: Int              -- ^ Number of principal components to preserve
  -> [Vector Double]  -- ^ Observations
  -> PCA

Principal components analysis resulting in PCA tools

Arguments

  :: [Vector Double]                 -- ^ Data samples
  -> (Matrix Double, Vector Double)

Compute the covariance matrix sigma and return its eigenvectors u' and eigenvalues s
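The covariance step can be sketched on plain lists (the library itself works on hmatrix types; this is a hypothetical illustration): center each coordinate on its mean, then average the products of centered coordinates.

```haskell
-- Hypothetical sketch of the covariance matrix sigma over a list of
-- observations (each observation a list of coordinates):
-- sigma[i][j] = mean over samples of (x_i - mu_i) * (x_j - mu_j).
covarianceSketch :: [[Double]] -> [[Double]]
covarianceSketch xs =
  [ [ sum [ (row !! i - mu i) * (row !! j - mu j) | row <- xs ] / n
    | j <- idx ]
  | i <- idx ]
  where
    n    = fromIntegral (length xs)
    idx  = [0 .. length (head xs) - 1]
    mu k = sum (map (!! k) xs) / n

main :: IO ()
main = print (covarianceSketch [[1, 0], [-1, 0]])
```

The eigendecomposition of this matrix then yields the eigenvectors u' and eigenvalues s mentioned above.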

Arguments

  :: Double           -- ^ Retained variance, %
  -> [Vector Double]  -- ^ Observations
  -> PCA

Perform PCA using the minimal number of principal components required to retain given variance

# Supervised learning

Teacher matrix

0 0 0 0 0
0 0 0 0 0
1 1 1 1 1 <- Desired class index is 2
0 0 0 0 0 <- Number of classes is 4
^
5 repetitions

Arguments

  :: Int      -- ^ Number of classes (labels)
  -> Int      -- ^ Desired class index (starting from zero)
  -> Int      -- ^ Number of repeated columns in the teacher matrix
  -> Teacher

Create a binary Teacher matrix whose row of ones corresponds to the desired class index
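The diagram above can be reproduced with a short sketch (a hypothetical reimplementation on lists of rows, not the library's Teacher type): zeros everywhere, with a row of ones at the desired class index.

```haskell
-- Hypothetical teacher-matrix sketch: nClasses rows of reps columns,
-- all zero except the row at classIdx, which is all ones.
teacherSketch :: Int -> Int -> Int -> [[Double]]
teacherSketch nClasses classIdx reps =
  [ replicate reps (if r == classIdx then 1 else 0)
  | r <- [0 .. nClasses - 1] ]

main :: IO ()
main = mapM_ print (teacherSketch 4 2 5)  -- 4 classes, class 2, 5 repetitions
```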

newtype Classifier a Source #

A classifier function that maps a measurement matrix (measurements as columns, corresponding features as rows) into a categorical output.

Constructors

 Classifier
   classify :: Matrix Double -> a

newtype Regressor Source #

A regressor function that maps a feature matrix into a continuous multidimensional output. The feature matrix is expected to have columns corresponding to measurements (data points) and rows to features.

Constructors

 Regressor
   predict :: Matrix Double -> Matrix Double

Arguments

  :: (Storable a, Eq a)
  => Vector a       -- ^ List of all possible outcomes (classes)
  -> Matrix Double  -- ^ Network state (nonlinear response): each column corresponds to a measurement (data point) and each row to a feature
  -> Matrix Double  -- ^ Horizontally concatenated Teacher matrices, one row per desired class
  -> Either String (Classifier a)

Perform supervised learning (ridge regression) and create a linear Classifier function. The regression is run with regularization parameter μ = 1e-4.

Arguments

  :: Matrix Double  -- ^ Feature matrix with data points (measurements) as columns and features as rows
  -> Matrix Double  -- ^ Desired-outputs matrix corresponding to the data-point columns. For scalar (one-dimensional) prediction output, this should be a single-row matrix.
  -> Either String Regressor

Perform supervised learning (ridge regression) and create a linear Regressor function.
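The ridge-regression idea behind these learners can be sketched in one dimension with plain lists (a hypothetical illustration, not the library's matrix implementation): the regularization term mu is added to the denominator of the least-squares solution.

```haskell
-- Hypothetical one-dimensional ridge regression sketch:
-- the weight w minimizing sum (w*x - y)^2 + mu*w^2 is
-- w = (x . y) / (x . x + mu).
ridge1D :: Double -> [Double] -> [Double] -> Double
ridge1D mu xs ys =
  sum (zipWith (*) xs ys) / (sum (map (^ 2) xs) + mu)

main :: IO ()
main = print (ridge1D 1.0e-4 [1, 2, 3] [2, 4, 6])  -- close to the true slope 2
```

With mu = 0 this reduces to ordinary least squares; a small mu (such as the 1e-4 mentioned above) slightly shrinks the weight but stabilizes the solution.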

Arguments

  :: Matrix Double  -- ^ Measurements (feature matrix)
  -> Matrix Double  -- ^ Horizontally concatenated Teacher matrices
  -> Maybe Readout

Create a linear Readout using ridge regression. Similar to learnRegressor, but instead of a Regressor function, an (already transposed) Readout matrix may be returned.

Arguments

  :: Readout        -- ^ Readout matrix
  -> Matrix Double  -- ^ Network state
  -> Vector Double

Evaluate the network state (nonlinear response) according to some Readout matrix. Used by classification strategies such as winnerTakesAll.

Arguments

  :: (Storable a, Eq a)
  => Readout        -- ^ Readout matrix
  -> Vector a       -- ^ Vector of possible classes (labels)
  -> Matrix Double  -- ^ Input matrix
  -> a              -- ^ Label

Winner-takes-all classification method
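The winner-takes-all strategy can be sketched over plain lists (a hypothetical version; the library operates on Readout and hmatrix matrices): pair each label with its score and return the label with the largest score.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Hypothetical winner-takes-all sketch: the label whose score is maximal wins.
winnerTakesAllSketch :: Ord s => [lab] -> [s] -> lab
winnerTakesAllSketch labels ss =
  fst (maximumBy (comparing snd) (zip labels ss))

main :: IO ()
main = print (winnerTakesAllSketch "abc" [0.1, 0.9, 0.3 :: Double])
```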

# Evaluation

## Classification

accuracy :: (Eq lab, Fractional acc) => [lab] -> [lab] -> acc Source #

Accuracy of classification, 100% - errorRate

>>> accuracy [1,2,3,4] [1,2,3,7]
75.0


errorRate :: (Eq a, Fractional err) => [a] -> [a] -> err Source #

Error rate in %, an error measure for classification tasks

>>> errorRate [1,2,3,4] [1,2,3,7]
25.0

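Both measures can be sketched over plain label lists (hypothetical reimplementations with primed names, assuming equal-length inputs and accuracy = 100% - errorRate as documented above):

```haskell
-- Hypothetical sketch: percentage of positions where target and
-- prediction disagree, and its complement.
errorRate' :: (Eq a, Fractional err) => [a] -> [a] -> err
errorRate' tgt prd =
  100 * fromIntegral (length (filter id (zipWith (/=) tgt prd)))
      / fromIntegral (length tgt)

accuracy' :: (Eq a, Fractional acc) => [a] -> [a] -> acc
accuracy' tgt prd = 100 - errorRate' tgt prd

main :: IO ()
main = do
  print (accuracy'  [1, 2, 3, 4 :: Int] [1, 2, 3, 7] :: Double)  -- 75.0
  print (errorRate' [1, 2, 3, 4 :: Int] [1, 2, 3, 7] :: Double)  -- 25.0
```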

errors :: Eq lab => [(lab, lab)] -> [(lab, lab)] Source #

Pairs of misclassified and correct values

>>> errors $ zip ['x','y','z'] ['x','b','a']
[('y','b'),('z','a')]


Arguments

  :: (Ord lab, Eq lab, Show lab)
  => [lab]   -- ^ Target labels
  -> [lab]   -- ^ Predicted labels
  -> String  -- ^ Confusion matrix normalized by row, as an ASCII table. Note: it is assumed that the target (true) labels list contains all possible labels.

        | Predicted
     ---+------------
        | _ _ _ _ _
  True  | _ _ _ _ _
        | _ _ _ _ _
  label | _ _ _ _ _
        | _ _ _ _ _

>>> putStr $ showConfusion [1, 2, 3, 1] [1, 2, 3, 2]
      1     2     3
1   50.0  50.0   0.0
2    0.0 100.0   0.0
3    0.0   0.0 100.0


Arguments

  :: (Ord lab, Eq lab)
  => Normalize              -- ^ Normalize ByRow or ByColumn
  -> [lab]                  -- ^ Target labels
  -> [lab]                  -- ^ Predicted labels
  -> Map (lab, lab) Double  -- ^ Map keys: (target, predicted); values: normalized confusion

Normalized confusion matrix for arbitrary number of classes

data Normalize Source #

Normalization strategies for confusion matrix

Constructors

 ByRow
 ByColumn

Instances

 Show Normalize

Arguments

  :: (Ord lab, Eq lab)
  => [lab]               -- ^ Target labels
  -> [lab]               -- ^ Predicted labels
  -> Map (lab, lab) Int  -- ^ Map keys: (target, predicted); values: confusion counts

Confusion matrix for arbitrary number of classes (not normalized)
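The counting step can be sketched with Data.Map (a hypothetical reimplementation): tally each (target, predicted) pair.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical confusion-count sketch: one Map entry per observed
-- (target, predicted) pair, summing duplicates.
confusionSketch :: Ord lab => [lab] -> [lab] -> Map.Map (lab, lab) Int
confusionSketch tgt prd =
  Map.fromListWith (+) [ ((t, p), 1) | (t, p) <- zip tgt prd ]

main :: IO ()
main = print (Map.toList (confusionSketch [1, 2, 3, 1 :: Int] [1, 2, 3, 2]))
```

Normalizing each row (or column) of these counts to sum to 100% yields the normalized confusion matrix above.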

## Regression

Arguments

  :: (Storable a, Floating a)
  => Vector a  -- ^ Target signal
  -> Vector a  -- ^ Predicted signal
  -> a         -- ^ NRMSE

Normalized root mean square error (NRMSE), one of the most common error measures for regression tasks
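One common definition of NRMSE divides the root mean square error by the standard deviation of the target signal; the sketch below uses plain lists under that assumption (the library's exact normalization may differ in detail).

```haskell
-- Hypothetical NRMSE sketch: sqrt of (mean squared error / target variance).
nrmseSketch :: Floating a => [a] -> [a] -> a
nrmseSketch tgt prd = sqrt (mse / var)
  where
    n    = fromIntegral (length tgt)
    mse  = sum (zipWith (\t p -> (t - p) ^ 2) tgt prd) / n
    mean = sum tgt / n
    var  = sum (map (\t -> (t - mean) ^ 2) tgt) / n

main :: IO ()
main = print (nrmseSketch [0, 2] [1, 1] :: Double)
```

A value of 0 means a perfect prediction; a value of 1 means the prediction is no better than constantly predicting the target's mean.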