hmatrix-nipals-0.2: NIPALS method for Principal Components Analysis on large data-sets.

Numeric.LinearAlgebra.NIPALS

Description

Nonlinear Iterative Partial Least Squares

Synopsis

# Simplified Interface

Calculate the first principal component of a set of samples.

Each row in the matrix is one sample. Note that this is the transpose of the layout expected by principal components implementations based on `svd` or `leftSV`.

Example:

```
let (pc,scores,residuals) = firstPC $ fromRows samples
```

This is calculated by passing a default initial estimate of the scores to `firstPCFromScores`.
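To make the role of the score estimate concrete, the following is a minimal sketch of the textbook NIPALS iteration on plain lists. It is not the library's implementation: `Vec`, `nipalsStep`, and `firstPCSketch` are hypothetical names, the iteration cap and tolerance are arbitrary, and the data is assumed to be non-zero with a starting guess that is not orthogonal to the true scores.

```haskell
import Data.List (transpose)

-- Hypothetical stand-in for Vector Double, used only in this sketch.
type Vec = [Double]

dot :: Vec -> Vec -> Double
dot xs ys = sum (zipWith (*) xs ys)

-- | One NIPALS step: from the current scores t, compute the unit-length
-- loading p = X^T t / |X^T t| and the updated scores t' = X p.
nipalsStep :: [Vec] -> Vec -> (Vec, Vec)
nipalsStep rows t = (p, t')
  where
    raw = map (`dot` t) (transpose rows)      -- X^T t, one entry per column
    p   = map (/ sqrt (dot raw raw)) raw      -- normalise the loading
    t'  = map (`dot` p) rows                  -- project each sample onto p

-- | Repeat the step until the scores stop changing (relative tolerance)
-- or an arbitrary cap of 200 iterations is reached.
firstPCSketch :: [Vec] -> Vec -> (Vec, Vec)
firstPCSketch rows = go (200 :: Int)
  where
    go n t
      | n <= 0 || delta <= 1e-12 * dot t' t' = (p, t')
      | otherwise                            = go (n - 1) t'
      where
        (p, t') = nipalsStep rows t
        d       = zipWith (-) t' t
        delta   = dot d d
```

Under these assumptions, a better starting guess simply means fewer passes through `nipalsStep` before the scores settle.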

Calculate the first principal component of a set of samples given a starting estimate of the scores.

Each row in the matrix is one sample. Note that this is the transpose of the layout expected by principal components implementations based on `svd` or `leftSV`.

The second argument is a starting guess for the score vector. If it is close to the actual score vector, the algorithm will converge much faster.

Example:

```
let (pc,scores,residuals) = firstPCFromScores (fromRows samples) scoresGuess
```

firstPCFromScoresM :: Monad m => m [Vector Double] -> Vector Double -> m (Vector Double, Vector Double)

Calculate the first principal component, computing the samples afresh on every pass.

This function calculates exactly the same results as `firstPCFromScores` (minus the residual), but instead of an input `Matrix` it takes a monadic action that yields the list of samples. It guarantees that the list returned by the action will be consumed in a single pass. However, the action may be demanded many times.

The residual can't be calculated lazily, like it is in `firstPCFromScores`, because the samples would need to be demanded. Instead, to calculate the residual use `residual`.

There is no corresponding `firstPCM` that guesses the initial score vector for you: if you need this function instead of `firstPC`, you really should come up with a reasonable starting point, or convergence will take far too long.
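The single-pass guarantee can be illustrated with a small sketch. This is an assumption about how such a pass could be written, not the package's source: `Vec`, `forceVec`, `loadingPass`, and `scoresPass` are hypothetical helpers on plain lists. Each helper demands the action once and walks the resulting list strictly from left to right, so the samples never need to be held in memory as a whole.

```haskell
import Data.Functor.Identity (Identity (..))
import Data.List (foldl')

-- Hypothetical stand-in for Vector Double, used only in this sketch.
type Vec = [Double]

-- Force every element so the accumulator does not build up thunks.
forceVec :: Vec -> Vec
forceVec v = foldr seq () v `seq` v

-- | Demand the samples once and fold over them left to right,
-- accumulating sum_i t_i * x_i, then normalise to unit length.
loadingPass :: Monad m => m [Vec] -> Vec -> m Vec
loadingPass samplesM t = do
  samples <- samplesM
  let step acc (ti, x) = forceVec (zipWith (\a xi -> a + ti * xi) acc x)
      raw = case zip t samples of
        []              -> []
        (t0, x0) : rest -> foldl' step (map (t0 *) x0) rest
      norm = sqrt (sum (map (^ (2 :: Int)) raw))
  return (map (/ norm) raw)

-- | Demand the samples again and project each one onto the loading p.
scoresPass :: Monad m => m [Vec] -> Vec -> m Vec
scoresPass samplesM p = fmap (map (\x -> sum (zipWith (*) x p))) samplesM

-- Usage with a pure action; approximately [0.447, 0.894].
exampleLoading :: Vec
exampleLoading =
  runIdentity (loadingPass (Identity [[1, 2], [2, 4], [3, 6]]) [1, 1, 1])
```

In this sketch one iteration demands the action twice, which is why a poor starting guess becomes expensive when producing the samples is costly.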

Arguments

 :: [Vector Double]  The samples
 -> Vector Double  The component (also called the loading)
 -> Vector Double  The scores
 -> [Vector Double]  The residuals for each sample

Calculate the residuals of a series of samples given a component and score vector.

```
(p,t) <- firstPCFromScoresM samplesM (randomVector 0 Gaussian numSamples)
samples <- samplesM
let r = residual samples p t
```
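As a rough illustration of what a residual means here, the sketch below subtracts each sample's reconstruction from the sample itself, that is, the residual of sample i is x_i - t_i * p. It uses plain lists and the hypothetical name `residualSketch`; it is the standard definition, not necessarily how `residual` is implemented in this package.

```haskell
-- Hypothetical stand-in for Vector Double, used only in this sketch.
type Vec = [Double]

-- | For each sample x_i with score t_i, remove the part explained by
-- the component p: r_i = x_i - t_i * p.
residualSketch :: [Vec] -> Vec -> Vec -> [Vec]
residualSketch samples p t = zipWith perSample samples t
  where
    perSample x ti = zipWith (\xi pj -> xi - ti * pj) x p
```

For data that lies exactly along one direction, every residual is the zero vector; any remaining structure in the residuals is what a further component would have to explain.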