Jan Skibinski, Numeric Quest Inc., Huntsville, Ontario, Canada
1998.09.19, last modified 1998.12.28
It has been argued that the functional paradigm offers more support for scientific computing than the traditional imperative programming, such as greater similarity of functional implementation to mathematical specification of a problem. However, efficiency of scientific algorithms implemented in Haskell is very low compared to efficiencies of C or Fortran implementations - notwithstanding the exceptional descriptive power of Haskell.
With this in mind, we are attempting to demonstrate here that the indexing traps can be successfully avoided. This module implements afresh several typical problems from linear algebra. Standard Haskell lists are employed instead of arrays and not a single algorithm ever uses indices for lookups or updates.
Two major algorithms have been invented and implemented in Haskell: one for solving systems of linear equations and one for finding eigenvalues and eigenvectors of almost any type of a square matrix. This includes symmetric, hermitian, general complex or nonsymmetric matrices with real eigenvalues.
Contents
Notation
What follows is written in Dirac's notation, as used in Quantum Mechanics. Matrices are represented by capital letters, while vectors come in two varieties:
Bra vectors can be obtained from ket vectors by transposition and conjugation of their components. Conjugation is only important for complex vectors.
Scalar product of two vectors | x > and | y > is written as
< x | y >which looks like a bracket and is sometimes called a "bra_ket". This justifies "bra" and "ket" names introduced by Dirac. There is a good reason for conjugating the components of "bra-vector": the scalar product of
< x | x >should be a square of the norm of the vector "x", and that means that it should be represented by a real number, or complex number but with its imaginary part equal to zero.
Scalar product and vector normalization> module Orthogonals where > import Complex > import Ratio > import qualified List
The scalar product "bra_ket" is a basis of many algorithms presented here.
Notice the call to function "coupled" in the above implementation of scalar product. This function conjugates its argument if it is complex, otherwise does not change it. It is defined in the class Scalar - specifically designed for this purpose mainly.> bra_ket :: (Scalar a, Num a) => [a] -> [a] -> a > bra_ket u v = > -- > -- Scalar product of two vectors u and v, > -- or < u | v > in Dirac's notation. > -- This is equally valid for both: real and complex vectors. > -- > sum_product u (map coupled v)
But we also need a slightly different definition of scalar product that will appear in multiplication of matrices by vectors (or vice versa): a straightforward accumulated product of two lists, where no complex conjugation takes place. We will call it a 'sum_product".> class Scalar a where > coupled :: a->a > norm :: [a] -> a > almostZero :: a -> Bool > scaled :: [a] -> [a]> instance Scalar Double where > coupled x = x > norm u = sqrt (bra_ket u u) > almostZero x = (abs x) < 1.0e-8 > scaled = scaled'> instance Scalar Float where > coupled x = x > norm u = sqrt (bra_ket u u) > almostZero x = (abs x) < 1.0e-8 > scaled = scaled'> instance (Integral a) => Scalar (Ratio a) where > coupled x = x > -- norm u = fromDouble ((sqrt (bra_ket u u))::Double) > -- Intended hack to silently convert to and from Double. > -- But I do not know how to declare it properly. > -- > -- Our type Fraction, when used instead of Ratio a, has its own > -- definition of sqrt. No hack would be needed here. > almostZero x = abs x < 1e-8 > scaled = scaled'> instance (RealFloat a) => Scalar (Complex a) where > coupled (x:+y) = x:+(-y) > norm u = sqrt (realPart (bra_ket u u)) :+ 0 > almostZero z = (realPart (abs z)) < 1.0e-8 > scaled u = [(x/m):+(y/m) | x:+y <- u] > where m = maximum [max (abs x) (abs y) | x:+y <- u]> norm1 :: (Num a) => [a] -> a > norm1 = sum . map abs> norminf :: (Num a, Ord a) => [a] -> a > norminf = maximum . map abs> matnorm1 :: (Num a, Ord a) => [[a]] -> a > matnorm1 = matnorminf . transposed> matnorminf :: (Num a, Ord a) => [[a]] -> a > matnorminf = maximum . map norm1
Some algorithms might need vectors normalized to one, although we'll try to avoid the normalizations due to its high cost or its inapplicability to rational numbers. Instead, we wiil scale vectors by their maximal components.> sum_product :: Num a => [a] -> [a] -> a > sum_product u v = > -- > -- Similar to scalar product but without > -- conjugations of | u > components > -- Used in matrix-vector or vector-matrix products > -- > sum (zipWith (*) u v)
> normalized :: (Scalar a, Fractional a) => [a] -> [a] > normalized u = > [uk/n | uk <- u] > where > n = norm u> scaled' :: (Fractional t, Ord t) => [t] -> [t] > scaled' u = > [uk/um | uk <- u] > where > um = maximum [abs uk| uk <- u]
Transposition and adjoining of matrices
Matrices are represented here by lists of lists. Function "transposed" converts from row-wise to column-wise representation, or vice versa.
A square matrix is called symmetric if it is equal to its transpose
A = A^{T}It is called Hermitian, or self-adjoint, if it equals to its adjoint
A = A^{+}> transposed :: [[a]] -> [[a]] > transposed a > | null (head a) = [] > | otherwise = ([head mi| mi <- a]) > :transposed ([tail mi| mi <- a])> adjoint :: Scalar a => [[a]] -> [[a]] > adjoint a > | null (head a) = [] > | otherwise = ([coupled (head mi)| mi <- a]) > :adjoint ([tail mi| mi <- a])
Linear combination and sum of two matrices
One can form a linear combination of two matrices, such as
C = alpha A + beta B where alpha and beta are scalarsThe most generic form of any combination, not necessary linear, of components of two matrices is given by "matrix_zipWith" function below, which accepts a function "f" describing such combination. For the linear combination with two scalars the function "f" could be defined as:
f alpha beta a b = alpha*a + beta*bFor a straightforward addition of two matrices this auxiliary function is simply "(+)".
> matrix_zipWith :: (a -> b -> c) -> [[a]] -> [[b]] -> [[c]] > matrix_zipWith f a b = > -- > -- Matrix made of a combination > -- of matrices a and b - as specified by f > -- > [zipWith f ak bk | (ak,bk) <- zip a b]> add_matrices :: (Num a) => t -> t1 -> [[a]] -> [[a]] -> [[a]] > add_matrices _ _ = matrix_zipWith (+)
Products involving matrices
Variety of products involving matrices can be defined. Our Haskell implementation is based on lists of lists and therefore is open to interpretation: sublists can either represent the rows or the columns of a matrix.
C = A B
Inner product of two matrices A B can be expressed quite simply, providing that matrix A is represented by a list of rows and B - by a list of columns. Function "matrix_matrix" answers list of rows, while "matrix_matrix'" - list of columns.
| u > = A | v >> matrix_matrix :: Num a => [[a]] -> [[a]] -> [[a]] > matrix_matrix a b > -- > -- A matrix being an inner product > -- of matrices A and B, where > -- A is represented by a list of rows a > -- B is represented by a list of columns b > -- result is represented by list of rows > -- Require: length of a is equal of length of b > -- Require: all sublists are of equal length > > | null a = [] > | otherwise = ([sum_product (head a) bi | bi <- b]) > : matrix_matrix (tail a) b> matrix_matrix' :: (Num a) => [[a]] -> [[a]] -> [[a]] > matrix_matrix' a b > -- > -- Similar to "matrix_matrix" > -- but the result is represented by > -- a list of columns > -- > | null b = [] > | otherwise = ([sum_product ai (head b) | ai <- a]) > : matrix_matrix' a (tail b)> triangle_matrix' :: Num a => [[a]] -> [[a]] -> [[a]] > triangle_matrix' r q = > -- > -- List of columns of of a product of > -- upper triangular matrix R and square > -- matrix Q > -- where > -- r is a list of rows of R > -- q is a list of columns of A > -- > [f r qk | qk <- q] > where > f t u > | null t = [] > | otherwise = (sum_product (head t) u) > : (f (tail t) (tail u))
Product of a matrix and a ket-vector is another ket vector. The following implementation assumes that list "a" represents rows of matrix A.
< u | = < v | A> matrix_ket :: Num a => [[a]] -> [a] -> [a] > matrix_ket a v = [sum_product ai v| ai <- a]
Bra-vector multiplied by a matrix produces another bra-vector. The implementation below assumes that list "a" represents columns of matrix A. It is also assumed that vector "v" is given in its standard "ket" representation, therefore the definition below uses "bra_ket" instead of "sum_product".
alpha = < u | A | v >> bra_matrix :: (Scalar a, Num a) => [a] -> [[a]] -> [a] > bra_matrix v a = [bra_ket v ai | ai <- a]
This kind of product results in a scalar and is often used to define elements of a new matrix, such as
B[i,j] = < ei | A | ej >The implementation below assumes that list "a" represents rows of matrix A.
B = alpha A> bra_matrix_ket :: (Scalar a, Num a) => [a] -> [[a]] -> [a] -> a > bra_matrix_ket u a v = > bra_ket u (matrix_ket a v)
Below is a function which multiplies matrix by a scalar:
> scalar_matrix :: Num a => a -> [[a]] -> [[a]] > scalar_matrix alpha a = > [[alpha*aij| aij <- ai] | ai<-a]
Orthogonalization process
Gram-Schmidt orthogonalization procedure is used here for calculation of sets of mutually orthogonal vectors.
Function "gram_schmidt" computes one vector - orthogonal to an incomplete set of orthogonal vectors, which form a hyperplane in the vector space. Another words, "gram_schmidt" vector is perpendicular to such a hyperlane.
> orthogonals :: (Scalar a, Fractional a) => [a] -> [[a]] > orthogonals x = > -- > -- List of (n-1) linearly independent vectors, > -- (mutually orthogonal) and orthogonal to the > -- vector x, but not normalized, > -- where > -- n is a length of x. > -- > orth [x] size (next (-1)) > where > orth a n m > | n == 1 = drop 1 (reverse a) > | otherwise = orth ((gram_schmidt a u ):a) (n-1) (next m) > where > u = unit_vector m size > size = length x > next i = if (i+1) == k then (i+2) else (i+1) > k = length (takeWhile (== 0) x) -- first non-zero component of x> gram_schmidt :: (Scalar a, Fractional a) => [[a]] -> [a] -> [a] > gram_schmidt a u = > -- > -- Projection of vector | u > on some direction > -- orthogonal to the hyperplane spanned by the list 'a' > -- of mutually orthogonal (linearly independent) > -- vectors. > -- > gram_schmidt' a u u > where > gram_schmidt' [] _ w = w > gram_schmidt' (b:bs) v w > | all (== 0) b = gram_schmidt' bs v w > | otherwise = gram_schmidt' bs v w' > where > w' = vectorCombination w (-(bra_ket b v)/(bra_ket b b)) b > vectorCombination x c y > | null x = [] > | null y = [] > | otherwise = (head x + c * (head y)) > : (vectorCombination (tail x) c (tail y))
Solutions of linear equations by orthogonalization
A matrix equation for unknown vector | x >
A | x > = | b >can be rewritten as
x1 | 1 > + x2 | 2 > + x3 | 3 > + ... + xn | n > = | b > (7.1) where | 1 >, | 2 >... represent columns of the matrix AFor any n-dimensional vector, such as "1", there exist n-1 linearly independent vectors "ck" that are orthogonal to "1"; that is, each satisfies the relation:
< ck | 1 > = 0, for k = 1...m, where m = n - 1If we could find all such vectors, then we could multiply the equation (7.1) by each of them, and end up with m = n-1 following equations
< c1 | 2 > x2 + < c1 | 3 > x3 + ... < c1 | n > xn = < c1 | b > < c2 | 2 > x2 + < c2 | 3 > x3 + ... < c2 | n > xn = < c2 | b > ....... < cm | 2 > x2 + < cm | 3 > x3 + ... < cm | n > xn = < cm | b >But the above is nothing more than a new matrix equation
A' | x' > = | b' > or x2 | 2'> + x3 | 3'> .... + xn | n'> = | b'> where primed vectors | 2' >, etc. are the columns of the new matrix A'.with the problem dimension reduced by one.
x1 | 1 > + x2 | 2 > + x3 | 3 > + x4 | 4 > = | b > x2 | 2'> + x3 | 3'> + x4 | 4'> = | b'> x3 | 3''> + x4 | 4''> = | b''> x4 | 4'''> = | b'''>But if we premultiply each vector equation by a non-zero vector of our choice, say < 1 | , < 2' |, < 3'' |, and < 4''' | - chosen correspondingly for equations 1, 2, 3 and 4, then the above system of vector equations will be converted to much simpler system of scalar equations. The result is shown below in matrix representation:
| p11 p12 p13 p14 | | x1 | = | q1 | | 0 p22 p23 p24 | | x2 | = | q2 | | 0 0 p33 p34 | | x3 | = | q3 | | 0 0 0 p44 | | x4 | = | q4 |In effect, we have triangularized our original matrix A. Below is a function that does that for any problem size:
The triangular system of equations can be easily solved by successive substitutions - starting with the last equation.> one_ket_triangle :: (Scalar a, Fractional a) => [[a]] -> [a] -> [([a],a)] > one_ket_triangle a b > -- > -- List of pairs: (p, q) representing > -- rows of triangular matrix P and of vector | q > > -- in the equation P | x > = | q >, which > -- has been obtained by linear transformation > -- of the original equation A | x > = | b > > -- > | null a = [] > | otherwise = (p,q):(one_ket_triangle a' b') > where > p = [bra_ket u ak | ak <- a] > q = bra_ket u b > a' = [[bra_ket ck ai | ck <- orth] | ai <- v] > b' = [ bra_ket ck b | ck <- orth] > orth = orthogonals u > u = head a > v = tail a
The triangularization procedure can be easily extended to a list of several ket-vectors | b > on the right hand side of the original equation A | x > = | b > -- instead of just one:> one_ket_solution :: (Fractional a, Scalar a) => [[a]] -> [a] -> [a] > one_ket_solution a b = > -- > -- List representing vector |x>, which is > -- a solution of the matrix equation > -- A |x> = |b> > -- where > -- a is a list of columns of matrix A > -- b is a list representing vector |b> > -- > solve' (unzip (reverse (one_ket_triangle a b))) [] > where > solve' (d, c) xs > | null d = xs > | otherwise = solve' ((tail d), (tail c)) (x:xs) > where > x = (head c - (sum_product (tail u) xs))/(head u) > u = head d
Similarly, function 'one_ket_solution' can be generalized to function 'many_kets_solution' that handles cases with several ket-vectors on the right hand side.> many_kets_triangle :: (Scalar a, Fractional a) => [[a]] -> [[a]] -> [([a],[a])] > many_kets_triangle a b > -- > -- List of pairs: (p, q) representing > -- rows of triangular matrix P and of rectangular matrix Q > -- in the equation P X = Q, which > -- has been obtained by linear transformation > -- of the original equation A X = B > -- where > -- a is a list of columns of matrix A > -- b is a list of columns of matrix B > -- > | null a = [] > | otherwise = (p,q):(many_kets_triangle a' b') > where > p = [bra_ket u ak | ak <- a] > q = [bra_ket u bk | bk <- b] > a' = [[bra_ket ck ai | ck <- orth] | ai <- v] > b' = [[bra_ket ck bi | ck <- orth] | bi <- b] > orth = orthogonals u > u = head a > v = tail a
> many_kets_solution :: (Scalar a, Fractional a) => [[a]] -> [[a]] -> [[a]] > many_kets_solution a b = > -- > -- List of columns of matrix X, which is > -- a solution of the matrix equation > -- A X = B > -- where > -- a is a list of columns of matrix A > -- b is a list of columns of matrix B > -- > solve' p q emptyLists > where > (p, q) = unzip (reverse (many_kets_triangle a b)) > emptyLists = [[] | _ <- [1..(length (head q))]] > solve' a' b' x > | null a' = x > | otherwise = solve' (tail a') (tail b') > [(f vk xk):xk | (xk, vk) <- (zip x v)] > where > f vk xk = (vk - (sum_product (tail u) xk))/(head u) > u = head a' > v = head b'
Matrix inversion
Function 'many_kets_solution' can be used to compute inverse of matrix A by specializing matrix B to a unit matrix I:
A X = IIt follows that matrix X is an inverse of A; that is X = A^{-1}.
> inverse :: (Scalar a, Fractional a) => [[a]] -> [[a]] > inverse a = many_kets_solution a (unit_matrix (length a)) > -- > -- List of columns of inverse of matrix A > -- where > -- a is list of columns of A
QR factorization of matrices
The process described above and implemented by 'many_kets_triangle' function transforms the equation
A X = Binto another equation for the same matrix X
R X = Swhere R is an upper triangular matrix. All operations performed on matrices A and B during this process are linear, and therefore we should be able to find a square matrix Q that describes the entire process in one step. Indeed, assuming that matrix A can be decomposed as a product of unknown matrix Q and triangular matrix R and that Q^{-1} is an inverse of matrix Q we can reach the last equation by following these steps:
A X = B (Q R) X = B Q^{-1} Q R X = Q^{-1} B R X = SIt follows that during this process a given matrix B transforms to matrix S, as delivered by 'many_kets_triangle':
S = Q^{-1} Bfrom which the inverse of Q can be found:
Q^{-1} = S B^{-1}Having a freedom of choice of the right hand side matrix B we can choose the unit matrix I in place of B, and therefore simplify the definition of Q^{-1}:
Q^{-1} = S, if B is unit matrixIt follows that any non-singular matrix A can be decomposed as a product of a matrix Q and a triangular matrix R
A = Q Rwhere matrices Q^{-1} and R are delivered by "many_kets_triangle" as a result of triangularization process of equation:
A X = IThe function below extracts a pair of matrices Q and R from the answer provided by "many_kets_triangle". During this process it inverts matrix Q^{-1} to Q. This factorization will be used by a sequence of similarity transformations to be defined in the next section.
> factors_QR :: (Fractional a, Scalar a) => [[a]] -> ([[a]],[[a]]) > factors_QR a = > -- > -- A pair of matrices (Q, R), such that > -- A = Q R > -- where > -- R is upper triangular matrix in row representation > -- (without redundant zeros) > -- Q is a transformation matrix in column representation > -- A is square matrix given as columns > -- > (inverse (transposed q1),r) > where > (r, q1) = unzip (many_kets_triangle a (unit_matrix (length a)))
Computation of the determinant
Naive division-free computation of the determinant by expanding the first column. It consumes n! multiplications.> determinant :: (Fractional a, Scalar a) => [[a]] -> a > determinant a = > let (q,r) = factors_QR a > -- matrix Q is not normed so we have to respect the norms of its rows > in product (map norm q) * product (map head r)
> determinantNaive :: (Num a) => [[a]] -> a > determinantNaive [] = 1 > determinantNaive m = > sum (alternate > (zipWith (*) (map head m) > (map determinantNaive (removeEach (