Safe Haskell | None |
---|---|
Language | Haskell2010 |
Machine learning utilities
A micro library containing the most common machine learning tools. See also the mltool package: https://hackage.haskell.org/package/mltool.
- data Dataset a b = Dataset {}
- fromList :: [(a, b)] -> Dataset a b
- data PCA = PCA {}
- pca :: Int -> [Vector Double] -> PCA
- pca' :: [Vector Double] -> (Matrix Double, Vector Double)
- pcaVariance :: Double -> [Vector Double] -> PCA
- type Teacher = Matrix Double
- teacher :: Int -> Int -> Int -> Teacher
- newtype Classifier a = Classifier {}
- newtype Regressor = Regressor {}
- type Readout = Matrix Double
- learnClassifier :: (Storable a, Eq a) => Vector a -> Matrix Double -> Matrix Double -> Either String (Classifier a)
- learnRegressor :: Matrix Double -> Matrix Double -> Either String Regressor
- learn' :: Matrix Double -> Matrix Double -> Maybe Readout
- scores :: Readout -> Matrix Double -> Vector Double
- winnerTakesAll :: (Storable a, Eq a) => Readout -> Vector a -> Matrix Double -> a
- accuracy :: (Eq lab, Fractional acc) => [lab] -> [lab] -> acc
- errorRate :: (Eq a, Fractional err) => [a] -> [a] -> err
- errors :: Eq lab => [(lab, lab)] -> [(lab, lab)]
- showConfusion :: (Ord lab, Eq lab, Show lab) => [lab] -> [lab] -> String
- confusion :: (Ord lab, Eq lab) => Normalize -> [lab] -> [lab] -> Map (lab, lab) Double
- data Normalize
- confusion' :: (Ord lab, Eq lab) => [lab] -> [lab] -> Map (lab, lab) Int
- nrmse :: (Storable a, Floating a) => Vector a -> Vector a -> a
Datasets
A dataset representation for supervised learning
fromList :: [(a, b)] -> Dataset a b Source #
Create a Dataset from a list of samples (first) and labels (second)
Principal components analysis
Principal components analysis tools
Principal components analysis resulting in PCA tools
Compute the covariance matrix sigma and return its eigenvectors u' and eigenvalues s
Perform PCA using the minimal number of principal components required to retain the given variance
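The selection rule described above can be sketched in plain Haskell. This is only an illustration under stated assumptions: the name componentsFor is hypothetical and not part of this library, the variance argument is assumed to be a fraction between 0 and 1, and the eigenvalues are assumed to be sorted in descending order.

```haskell
-- Hypothetical helper, not part of the library's API.
-- Given the fraction of variance to retain and the eigenvalues of the
-- covariance matrix (assumed sorted in descending order), return the
-- smallest number of leading components whose cumulative share of the
-- total variance reaches that fraction.
componentsFor :: Double -> [Double] -> Int
componentsFor retain eigenvalues =
  let total      = sum eigenvalues
      cumulative = scanl1 (+) eigenvalues
      -- Prefixes that still fall short of the requested share
      tooFew     = takeWhile (\c -> c / total < retain) cumulative
  in min (length eigenvalues) (length tooFew + 1)
```

With eigenvalues 4, 3, 2, 1 the first two components cover 70% of the total, so retaining 80% needs three components.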
Supervised learning
type Teacher = Matrix Double Source #
Teacher matrix
0 0 0 0 0
0 0 0 0 0
1 1 1 1 1 <- Desired class index is 2
0 0 0 0 0 <- Number of classes is 4
        ^ 5 repetitions
:: Int | Number of classes (labels) |
-> Int | Desired class index (starting from zero) |
-> Int | Number of repeated columns in teacher matrix |
-> Teacher |
Create a binary Teacher matrix with a row of ones corresponding to the desired class index
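The layout of such a matrix can be illustrated with plain lists. This is a minimal sketch, not the library's implementation: teacherRows is a hypothetical name, and a list of rows stands in for Matrix Double.

```haskell
-- Hypothetical list-of-rows stand-in for the Teacher matrix.
-- Arguments follow the documented order: number of classes, desired class
-- index (starting from zero), number of repeated columns.
teacherRows :: Int -> Int -> Int -> [[Double]]
teacherRows nClasses desired reps =
  [ replicate reps (if row == desired then 1 else 0)
  | row <- [0 .. nClasses - 1]
  ]
```

Calling teacherRows 4 2 5 yields four rows of five entries each, with ones only in the third row, as in the example matrix above.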
newtype Classifier a Source #
Classifier function that maps a feature matrix, with measurements (data points) as columns and features as rows, into a categorical output.
Regressor function that maps a feature matrix into a continuous multidimensional output. The feature matrix is expected to have columns corresponding to measurements (data points) and rows corresponding to features.
:: (Storable a, Eq a) | |
=> Vector a | All possible outcomes (classes) list |
-> Matrix Double | Network state (nonlinear response) where each matrix column corresponds to a measurement (data point) and each row corresponds to a feature |
-> Matrix Double | Horizontally concatenated |
-> Either String (Classifier a) |
Perform supervised learning (ridge regression) and create a linear Classifier function.
The regression is run with regularization parameter μ = 1e-4.
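For orientation, the textbook closed-form ridge regression solution for this column-per-measurement layout is given below. The symbols are introduced here for illustration: X is the feature matrix (features by measurements), Y is the matrix of desired outputs, W is the learned readout, and I is the identity matrix. Whether the library evaluates exactly this expression is an assumption.

```latex
W = \arg\min_{W} \; \lVert W X - Y \rVert_F^2 + \mu \lVert W \rVert_F^2
  = Y X^{\top} \left( X X^{\top} + \mu I \right)^{-1}
```

Setting the gradient 2 (W X - Y) X^T + 2 μ W to zero gives W (X X^T + μ I) = Y X^T, from which the expression follows.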
:: Matrix Double | Feature matrix with data points (measurements) as columns and features as rows |
-> Matrix Double | Desired outputs matrix corresponding to data point columns. In case of scalar (one-dimensional) prediction output, it should be a single row matrix. |
-> Either String Regressor |
Perform supervised learning (ridge regression) and create a linear Regressor function.
:: Matrix Double | Measurements (feature matrix) |
-> Matrix Double | Horizontally concatenated |
-> Maybe Readout |
Create a linear Readout using ridge regression. Similar to learnRegressor, but instead of a Regressor function, an (already transposed) Readout matrix may be returned.
Evaluate the network state (nonlinear response) according to some Readout matrix. Used by classification strategies such as winnerTakesAll.
:: (Storable a, Eq a) | |
=> Readout |
|
-> Vector a | Vector of possible classes (labels) |
-> Matrix Double | Input matrix |
-> a | Label |
Winner-takes-all classification method
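The final selection step of a winner-takes-all strategy can be sketched as an argmax over per-class scores. This is an illustration only: pickWinner is a hypothetical name, plain lists stand in for Vector, and how the library derives the scores from the Readout and the input matrix is not shown here.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

-- Hypothetical helper, not part of the library's API.
-- Pair each candidate label with its score and return the label with the
-- highest score. Assumes both lists are non-empty and of equal length.
pickWinner :: [a] -> [Double] -> a
pickWinner labels classScores =
  fst (maximumBy (comparing snd) (zip labels classScores))
```

For labels "abc" and scores 0.1, 0.7, 0.2 the winner is 'b'.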
Evaluation
Classification
accuracy :: (Eq lab, Fractional acc) => [lab] -> [lab] -> acc Source #
Accuracy of classification, 100% - errorRate
>>> accuracy [1,2,3,4] [1,2,3,7]
75.0
errorRate :: (Eq a, Fractional err) => [a] -> [a] -> err Source #
Error rate in %, an error measure for classification tasks
>>> errorRate [1,2,3,4] [1,2,3,7]
25.0
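The relationship between the two measures can be sketched as follows. These are hypothetical re-implementations for illustration (errorRatePct and accuracyPct are not library names), and they assume two non-empty lists of equal length.

```haskell
-- Hypothetical re-implementation, for illustration only.
-- Percentage of positions where the two label lists disagree.
-- For empty input the division is 0/0, so the result is not meaningful.
errorRatePct :: (Eq a, Fractional err) => [a] -> [a] -> err
errorRatePct targets predictions =
  100 * fromIntegral wrong / fromIntegral total
  where
    pairs = zip targets predictions
    total = length pairs
    wrong = length (filter (uncurry (/=)) pairs)

-- Accuracy is the complement of the error rate, in percent.
accuracyPct :: (Eq a, Fractional acc) => [a] -> [a] -> acc
accuracyPct targets predictions = 100 - errorRatePct targets predictions
```

On the inputs used in the examples above, one of four labels differs, which gives an error rate of 25.0 and an accuracy of 75.0.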
errors :: Eq lab => [(lab, lab)] -> [(lab, lab)] Source #
Pairs of misclassified and correct values
>>> errors $ zip ['x','y','z'] ['x','b','a']
[('y','b'),('z','a')]
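The behaviour shown in this example amounts to keeping only the pairs whose components differ. A minimal sketch, with the hypothetical name mismatches standing in for the library function:

```haskell
-- Hypothetical re-implementation, for illustration only.
-- Keep the pairs whose two components are not equal.
mismatches :: Eq lab => [(lab, lab)] -> [(lab, lab)]
mismatches = filter (uncurry (/=))
```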
Confusion matrix normalized by row: ASCII representation.
Note: it is assumed that target (true) labels list contains all possible labels.
      | Predicted
   ---+------------
      | _ _ _ _ _
 True | _ _ _ _ _
      | _ _ _ _ _
label | _ _ _ _ _
      | _ _ _ _ _
>>> putStr $ showConfusion [1, 2, 3, 1] [1, 2, 3, 2]
      1     2     3
1  50.0  50.0   0.0
2   0.0 100.0   0.0
3   0.0   0.0 100.0
:: (Ord lab, Eq lab) | |
=> Normalize | |
-> [lab] | Target labels |
-> [lab] | Predicted labels |
-> Map (lab, lab) Double | Map keys: (target, predicted), values: normalized confusion |
Normalized confusion matrix for arbitrary number of classes
Normalization strategies for confusion matrix
:: (Ord lab, Eq lab) | |
=> [lab] | Target labels |
-> [lab] | Predicted labels |
-> Map (lab, lab) Int | Map keys: (target, predicted), values: confusion count |
Confusion matrix for arbitrary number of classes (not normalized)
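The counting and row normalization described in this section can be sketched with Data.Map from the containers package. This is an illustration under stated assumptions, not the library's code: confusionCounts and rowPercent are hypothetical names, row normalization is expressed in percent to match the showConfusion example, and (target, predicted) pairs that never occur are simply absent from the maps, which may differ from the library's behaviour.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical helper: count how often each (target, predicted) pair occurs.
confusionCounts :: Ord lab => [lab] -> [lab] -> Map.Map (lab, lab) Int
confusionCounts targets predictions =
  Map.fromListWith (+) [ (pair, 1) | pair <- zip targets predictions ]

-- Hypothetical helper: divide each count by the total of its target row
-- and express the result in percent.
rowPercent :: Ord lab => Map.Map (lab, lab) Int -> Map.Map (lab, lab) Double
rowPercent counts = Map.mapWithKey pct counts
  where
    -- Total number of samples per target label
    rowTotals = Map.fromListWith (+)
                  [ (target, n) | ((target, _), n) <- Map.toList counts ]
    pct (target, _) n =
      100 * fromIntegral n / fromIntegral (rowTotals Map.! target)
```

For targets [1, 2, 3, 1] and predictions [1, 2, 3, 2], label 1 is predicted once as 1 and once as 2, so both entries in its row come out as 50.0.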