This module performs support vector regression on a set of training points in order to determine the generating function. Currently, least squares support vector regression is implemented. The optimal solution to the Lagrangian is found by a conjugate gradient algorithm (CGA).
- data DataSet = DataSet {}
- data SVMSolution = SVMSolution {}
- newtype KernelFunction = KernelFunction ([Double] -> [Double] -> [Double] -> Double)
- class SVM a where
- createKernelMatrix :: a -> Array Int [Double] -> KernelMatrix
- dcost :: a -> Double
- evalKernel :: a -> [Double] -> [Double] -> Double
- simulate :: a -> SVMSolution -> Array Int [Double] -> [Double]
- solve :: a -> DataSet -> Double -> Int -> SVMSolution
- data LSSVM = LSSVM {}
- newtype KernelMatrix = KernelMatrix DoubleArray
- reciprocalKernelFunction :: [Double] -> [Double] -> [Double] -> Double
- radialKernelFunction :: [Double] -> [Double] -> [Double] -> Double
- linearKernelFunction :: [Double] -> [Double] -> [Double] -> Double
- splineKernelFunction :: [Double] -> [Double] -> [Double] -> Double
- polyKernelFunction :: [Double] -> [Double] -> [Double] -> Double
- mlpKernelFunction :: [Double] -> [Double] -> [Double] -> Double
Documentation
data DataSet
Each data set pairs a list of input vectors x with their values y. These are the training points, which satisfy f(x) = y for every pair {x, y}.
data SVMSolution
The solution contains the dual weights, the support vectors and the bias.
newtype KernelFunction
Every kernel function represents an inner product in feature space. The parameters are:
- A list of kernel parameters that can be interpreted differently by each kernel function.
- The first point in the inner product.
- The second point in the inner product.
KernelFunction ([Double] -> [Double] -> [Double] -> Double)
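To make the argument order concrete, here is a minimal sketch of a function with the documented shape (kernel parameters, first point, second point). The name `gaussianSketch` and the reading of the first parameter as a width `sigma` are assumptions made for illustration; this is not the module's `radialKernelFunction`, whose parameterisation is not stated here.

```haskell
-- Hypothetical kernel for illustration; not one of the module's exports.
-- Argument order follows the documented convention:
-- kernel parameters, first point, second point.
gaussianSketch :: [Double] -> [Double] -> [Double] -> Double
gaussianSketch params x y = exp (negate (sqDist / (2 * sigma * sigma)))
  where
    -- Assumption: the first kernel parameter is the width sigma (default 1).
    sigma = case params of
      (s : _) -> s
      []      -> 1
    -- Squared Euclidean distance between the two points.
    sqDist = sum (zipWith (\a b -> (a - b) ^ (2 :: Int)) x y)
```

A function of this type could presumably be wrapped with the `KernelFunction` constructor shown above.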
class SVM a where
A support vector machine (SVM) estimates a function from training data. Instances of this class need only implement the dual cost and the kernel function. Default implementations are provided for finding the SVM solution, for simulating a function and for creating a kernel matrix from a set of training points. All SVMs should return a solution that contains a list of the support vectors and their dual weights. dcost represents the coefficient of the dual cost function. This term is added to the diagonal elements of the kernel matrix and may differ for each type of SVM.
createKernelMatrix :: a -> Array Int [Double] -> KernelMatrix
Creates a KernelMatrix from the training points in the DataSet. If kf is the KernelFunction, then the elements of the kernel matrix are given by K[i,j] = kf x[i] x[j], where the x[i] are taken from the training points. The kernel matrix is symmetric and positive semi-definite. Only the bottom half of the kernel matrix is stored.
The derivative of the cost function is added to the diagonal elements of the kernel matrix. This places a cost on the norm of the solution, which helps prevent overfitting of the training data.
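The construction described above can be sketched with plain lists. This is an illustration only: `lowerTriangle` and its arguments are hypothetical names, the kernel is assumed to have its parameters already applied, and the list-of-rows layout is an assumption that stands in for the module's unboxed array.

```haskell
-- Sketch only: a list-of-rows lower triangle, not the module's DoubleArray layout.
lowerTriangle
  :: ([Double] -> [Double] -> Double)  -- kernel with its parameters already applied
  -> Double                            -- term added to each diagonal element
  -> [[Double]]                        -- training points x[0], x[1], ...
  -> [[Double]]                        -- row i holds K[i,0] .. K[i,i]
lowerTriangle kf diagTerm xs =
  [ [ entry i xi j xj | (j, xj) <- take (i + 1) indexed ]
  | (i, xi) <- indexed ]
  where
    indexed = zip [0 :: Int ..] xs
    -- Off-diagonal entries are plain kernel values; the diagonal gets the extra term.
    entry i xi j xj
      | i == j    = kf xi xj + diagTerm
      | otherwise = kf xi xj
```

Because the matrix is symmetric, the entries above the diagonal can be recovered as K[j,i], which is why storing one half is sufficient.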
evalKernel :: a -> [Double] -> [Double] -> Double
This function provides access to the KernelFunction used by the SVM.
simulate :: a -> SVMSolution -> Array Int [Double] -> [Double]
This function takes an SVMSolution produced by the SVM passed in and a list of points in the space, and returns a list of values y = f(x), where f is the generating function represented by the support vector solution.
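In the usual support vector expansion, the value at a point x is the bias plus the sum of each dual weight times the kernel evaluated between its support vector and x. The sketch below assumes that form; `predictSketch` and its arguments are hypothetical names, since the fields of SVMSolution are not shown in this documentation.

```haskell
-- Sketch of evaluating a support vector expansion at new points.
-- All names here are hypothetical; the module's SVMSolution fields are not shown.
predictSketch
  :: ([Double] -> [Double] -> Double)  -- kernel with its parameters already applied
  -> [Double]                          -- dual weights alpha_i
  -> [[Double]]                        -- support vectors x_i
  -> Double                            -- bias b
  -> [[Double]]                        -- points at which to evaluate f
  -> [Double]                          -- f(p) = b + sum_i alpha_i * k(x_i, p)
predictSketch kf alphas svs bias points =
  [ bias + sum (zipWith (\a sv -> a * kf sv p) alphas svs) | p <- points ]
```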
solve :: a -> DataSet -> Double -> Int -> SVMSolution
This function takes a DataSet and feeds it to the SVM. It then returns the SVMSolution, which is the support vector solution for the function that generated the points in the training set. The function also takes values for epsilon and the maximum number of iterations, which are used as stopping criteria in the conjugate gradient algorithm.
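To show how epsilon and the iteration limit can act as stopping criteria, here is a textbook conjugate gradient sketch for a symmetric positive definite system A x = b. It is not the module's implementation: the names are hypothetical, the matrix is passed as a function computing A v, and stopping when the residual norm falls below epsilon is an assumption about how the tolerance is applied.

```haskell
-- Minimal conjugate gradient sketch for a symmetric positive definite system A x = b.
-- Names are hypothetical; this is not the module's solver.
type Vec = [Double]

dot :: Vec -> Vec -> Double
dot u v = sum (zipWith (*) u v)

-- axpy a x y computes a * x + y elementwise.
axpy :: Double -> Vec -> Vec -> Vec
axpy a = zipWith (\xi yi -> a * xi + yi)

conjGradSketch :: (Vec -> Vec) -> Vec -> Double -> Int -> Vec
conjGradSketch applyA b epsilon maxIter = go 0 x0 r0 r0 (dot r0 r0)
  where
    x0 = map (const 0) b   -- start from the zero vector
    r0 = b                 -- residual b - A x0, which equals b when x0 = 0
    go k x r p rr
      -- Stop at the iteration limit or once the residual norm is below epsilon.
      | k >= maxIter || sqrt rr < epsilon = x
      | otherwise =
          let ap    = applyA p
              alpha = rr / dot p ap             -- step length along direction p
              x'    = axpy alpha p x            -- x + alpha * p
              r'    = axpy (negate alpha) ap r  -- r - alpha * A p
              rr'   = dot r' r'
              p'    = axpy (rr' / rr) p r'      -- r' + beta * p
          in go (k + 1) x' r' p' rr'
```

In exact arithmetic the method reaches the solution of an n-by-n system in at most n iterations, so the iteration limit mainly guards against slow convergence caused by rounding or poor conditioning.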
data LSSVM
A least squares support vector machine. The cost represents the relative expense of missing a training point versus a more complicated generating function. The higher this number, the better the fit to the training set, but at the cost of poorer generalization. The LSSVM uses every training point in the solution and performs least squares regression on the dual of the problem.
LSSVM
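For orientation, the textbook least squares SVM regression problem reduces in the dual to a single linear system in the bias b and the dual weights α. The block below is a sketch of that standard form, where d stands for the term added to the diagonal of the kernel matrix K. In the textbook formulation d = 1/γ for a regularisation constant γ; how this module derives dcost from its cost value is not stated here, so that mapping is an assumption.

```latex
\begin{bmatrix} 0 & \mathbf{1}^{\mathsf T} \\ \mathbf{1} & K + d\,I \end{bmatrix}
\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix}
=
\begin{bmatrix} 0 \\ \mathbf{y} \end{bmatrix}
```

Every training point receives a weight α_i that is generally non-zero, which is consistent with the statement that the LSSVM uses every training point in the solution.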
newtype KernelMatrix
The kernel matrix has been implemented as an unboxed array for performance reasons.
KernelMatrix DoubleArray
reciprocalKernelFunction :: [Double] -> [Double] -> [Double] -> Double
The reciprocal kernel is the result of exponential basis functions, exp(-k*(x+a)). The inner product is an integral over all k >= 0.
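The integral can be worked out in closed form for scalar inputs, which shows where the name comes from. The derivation below assumes one-dimensional points x and y and treats a as a kernel parameter; how the module combines the coordinates of multi-dimensional points is not stated here.

```latex
K(x, y) = \int_0^{\infty} e^{-k(x+a)}\, e^{-k(y+a)}\, dk
        = \int_0^{\infty} e^{-k(x+y+2a)}\, dk
        = \frac{1}{x + y + 2a}, \qquad x + y + 2a > 0
```

The integral converges only when x + y + 2a is positive, so a must be large enough for the range of the data.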
radialKernelFunction :: [Double] -> [Double] -> [Double] -> Double
This is the kernel when radial basis functions are used.