Jan Skibinski, Numeric Quest Inc., Huntsville, Ontario, Canada
1998.09.19, last modified 1998.12.28
It has been argued that the functional paradigm offers more support for scientific computing than the traditional imperative programming, such as greater similarity of functional implementation to mathematical specification of a problem. However, efficiency of scientific algorithms implemented in Haskell is very low compared to efficiencies of C or Fortran implementations - notwithstanding the exceptional descriptive power of Haskell.
With this in mind, we are attempting to demonstrate here that the indexing traps can be successfully avoided. This module implements afresh several typical problems from linear algebra. Standard Haskell lists are employed instead of arrays and not a single algorithm ever uses indices for lookups or updates.
Two major algorithms have been invented and implemented in Haskell: one for solving systems of linear equations and one for finding eigenvalues and eigenvectors of almost any type of a square matrix. This includes symmetric, hermitian, general complex or nonsymmetric matrices with real eigenvalues.
Contents
Notation
What follows is written in Dirac's notation, as used in Quantum Mechanics. Matrices are represented by capital letters, while vectors come in two varieties:
Bra vectors can be obtained from ket vectors by transposition and conjugation of their components. Conjugation is only important for complex vectors.
Scalar product of two vectors | x > and | y > is written as
< x | y >which looks like a bracket and is sometimes called a "bra_ket". This justifies "bra" and "ket" names introduced by Dirac. There is a good reason for conjugating the components of "bra-vector": the scalar product of
< x | x >should be a square of the norm of the vector "x", and that means that it should be represented by a real number, or complex number but with its imaginary part equal to zero.
Scalar product and vector normalization> module Orthogonals where > import Data.Complex > import Data.Ratio > import qualified Data.List as List
The scalar product "bra_ket" is a basis of many algorithms presented here.
Notice the call to function "coupled" in the above implementation of scalar product. This function conjugates its argument if it is complex, otherwise does not change it. It is defined in the class Scalar - specifically designed for this purpose mainly.> bra_ket :: (Scalar a, Num a) => [a] -> [a] -> a > bra_ket u v = > -- > -- Scalar product of two vectors u and v, > -- or < u | v > in Dirac's notation. > -- This is equally valid for both: real and complex vectors. > -- > sum_product u (map coupled v)
But we also need a slightly different definition of scalar product that will appear in multiplication of matrices by vectors (or vice versa): a straightforward accumulated product of two lists, where no complex conjugation takes place. We will call it a 'sum_product".> class Scalar a where > coupled :: a->a > norm :: [a] -> a > almostZero :: a -> Bool > scaled :: [a] -> [a]> instance Scalar Double where > coupled x = x > norm u = sqrt (bra_ket u u) > almostZero x = (abs x) < 1.0e-8 > scaled = scaled'> instance Scalar Float where > coupled x = x > norm u = sqrt (bra_ket u u) > almostZero x = (abs x) < 1.0e-8 > scaled = scaled'> instance (Integral a) => Scalar (Ratio a) where > coupled x = x > -- norm u = fromDouble ((sqrt (bra_ket u u))::Double) > -- Intended hack to silently convert to and from Double. > -- But I do not know how to declare it properly. > -- > -- Our type Fraction, when used instead of Ratio a, has its own > -- definition of sqrt. No hack would be needed here. > almostZero x = abs x < 1e-8 > scaled = scaled'> instance (RealFloat a) => Scalar (Complex a) where > coupled (x:+y) = x:+(-y) > norm u = sqrt (realPart (bra_ket u u)) :+ 0 > almostZero z = (realPart (abs z)) < 1.0e-8 > scaled u = [(x/m):+(y/m) | x:+y <- u] > where m = maximum [max (abs x) (abs y) | x:+y <- u]> norm1 :: (Num a) => [a] -> a > norm1 = sum . map abs> norminf :: (Num a, Ord a) => [a] -> a > norminf = maximum . map abs> matnorm1 :: (Num a, Ord a) => [[a]] -> a > matnorm1 = matnorminf . transposed> matnorminf :: (Num a, Ord a) => [[a]] -> a > matnorminf = maximum . map norm1
Some algorithms might need vectors normalized to one, although we'll try to avoid the normalizations due to its high cost or its inapplicability to rational numbers. Instead, we wiil scale vectors by their maximal components.> sum_product :: Num a => [a] -> [a] -> a > sum_product u v = > -- > -- Similar to scalar product but without > -- conjugations of | u > components > -- Used in matrix-vector or vector-matrix products > -- > sum (zipWith (*) u v)
> normalized :: (Scalar a, Fractional a) => [a] -> [a] > normalized u = > [uk/n | uk <- u] > where > n = norm u> scaled' :: (Fractional t, Ord t) => [t] -> [t] > scaled' u = > [uk/um | uk <- u] > where > um = maximum [abs uk| uk <- u]
Transposition and adjoining of matrices
Matrices are represented here by lists of lists. Function "transposed" converts from row-wise to column-wise representation, or vice versa.
A square matrix is called symmetric if it is equal to its transpose
A = A^{T}It is called Hermitian, or self-adjoint, if it equals to its adjoint
A = A^{+}> transposed :: [[a]] -> [[a]] > transposed a > | null (head a) = [] > | otherwise = ([head mi| mi <- a]) > :transposed ([tail mi| mi <- a])> adjoint :: Scalar a => [[a]] -> [[a]] > adjoint a > | null (head a) = [] > | otherwise = ([coupled (head mi)| mi <- a]) > :adjoint ([tail mi| mi <- a])
Linear combination and sum of two matrices
One can form a linear combination of two matrices, such as
C = alpha A + beta B where alpha and beta are scalarsThe most generic form of any combination, not necessary linear, of components of two matrices is given by "matrix_zipWith" function below, which accepts a function "f" describing such combination. For the linear combination with two scalars the function "f" could be defined as:
f alpha beta a b = alpha*a + beta*bFor a straightforward addition of two matrices this auxiliary function is simply "(+)".
> matrix_zipWith :: (a -> b -> c) -> [[a]] -> [[b]] -> [[c]] > matrix_zipWith f a b = > -- > -- Matrix made of a combination > -- of matrices a and b - as specified by f > -- > [zipWith f ak bk | (ak,bk) <- zip a b]> add_matrices :: (Num a) => t -> t1 -> [[a]] -> [[a]] -> [[a]] > add_matrices _ _ = matrix_zipWith (+)
Products involving matrices
Variety of products involving matrices can be defined. Our Haskell implementation is based on lists of lists and therefore is open to interpretation: sublists can either represent the rows or the columns of a matrix.
C = A B
Inner product of two matrices A B can be expressed quite simply, providing that matrix A is represented by a list of rows and B - by a list of columns. Function "matrix_matrix" answers list of rows, while "matrix_matrix'" - list of columns.
| u > = A | v >> matrix_matrix :: Num a => [[a]] -> [[a]] -> [[a]] > matrix_matrix a b > -- > -- A matrix being an inner product > -- of matrices A and B, where > -- A is represented by a list of rows a > -- B is represented by a list of columns b > -- result is represented by list of rows > -- Require: length of a is equal of length of b > -- Require: all sublists are of equal length > > | null a = [] > | otherwise = ([sum_product (head a) bi | bi <- b]) > : matrix_matrix (tail a) b> matrix_matrix' :: (Num a) => [[a]] -> [[a]] -> [[a]] > matrix_matrix' a b > -- > -- Similar to "matrix_matrix" > -- but the result is represented by > -- a list of columns > -- > | null b = [] > | otherwise = ([sum_product ai (head b) | ai <- a]) > : matrix_matrix' a (tail b)> triangle_matrix' :: Num a => [[a]] -> [[a]] -> [[a]] > triangle_matrix' r q = > -- > -- List of columns of of a product of > -- upper triangular matrix R and square > -- matrix Q > -- where > -- r is a list of rows of R > -- q is a list of columns of A > -- > [f r qk | qk <- q] > where > f t u > | null t = [] > | otherwise = (sum_product (head t) u) > : (f (tail t) (tail u))
Product of a matrix and a ket-vector is another ket vector. The following implementation assumes that list "a" represents rows of matrix A.
< u | = < v | A> matrix_ket :: Num a => [[a]] -> [a] -> [a] > matrix_ket a v = [sum_product ai v| ai <- a]
Bra-vector multiplied by a matrix produces another bra-vector. The implementation below assumes that list "a" represents columns of matrix A. It is also assumed that vector "v" is given in its standard "ket" representation, therefore the definition below uses "bra_ket" instead of "sum_product".
alpha = < u | A | v >> bra_matrix :: (Scalar a, Num a) => [a] -> [[a]] -> [a] > bra_matrix v a = [bra_ket v ai | ai <- a]
This kind of product results in a scalar and is often used to define elements of a new matrix, such as
B[i,j] = < ei | A | ej >The implementation below assumes that list "a" represents rows of matrix A.
B = alpha A> bra_matrix_ket :: (Scalar a, Num a) => [a] -> [[a]] -> [a] -> a > bra_matrix_ket u a v = > bra_ket u (matrix_ket a v)
Below is a function which multiplies matrix by a scalar:
> scalar_matrix :: Num a => a -> [[a]] -> [[a]] > scalar_matrix alpha a = > [[alpha*aij| aij <- ai] | ai<-a]
Orthogonalization process
Gram-Schmidt orthogonalization procedure is used here for calculation of sets of mutually orthogonal vectors.
Function "gram_schmidt" computes one vector - orthogonal to an incomplete set of orthogonal vectors, which form a hyperplane in the vector space. Another words, "gram_schmidt" vector is perpendicular to such a hyperlane.
> orthogonals :: (Scalar a, Fractional a) => [a] -> [[a]] > orthogonals x = > -- > -- List of (n-1) linearly independent vectors, > -- (mutually orthogonal) and orthogonal to the > -- vector x, but not normalized, > -- where > -- n is a length of x. > -- > orth [x] size (next (-1)) > where > orth a n m > | n == 1 = drop 1 (reverse a) > | otherwise = orth ((gram_schmidt a u ):a) (n-1) (next m) > where > u = unit_vector m size > size = length x > next i = if (i+1) == k then (i+2) else (i+1) > k = length (takeWhile (== 0) x) -- first non-zero component of x> gram_schmidt :: (Scalar a, Fractional a) => [[a]] -> [a] -> [a] > gram_schmidt a u = > -- > -- Projection of vector | u > on some direction > -- orthogonal to the hyperplane spanned by the list 'a' > -- of mutually orthogonal (linearly independent) > -- vectors. > -- > gram_schmidt' a u u > where > gram_schmidt' [] _ w = w > gram_schmidt' (b:bs) v w > | all (== 0) b = gram_schmidt' bs v w > | otherwise = gram_schmidt' bs v w' > where > w' = vectorCombination w (-(bra_ket b v)/(bra_ket b b)) b > vectorCombination x c y > | null x = [] > | null y = [] > | otherwise = (head x + c * (head y)) > : (vectorCombination (tail x) c (tail y))
Solutions of linear equations by orthogonalization
A matrix equation for unknown vector | x >
A | x > = | b >can be rewritten as
x1 | 1 > + x2 | 2 > + x3 | 3 > + ... + xn | n > = | b > (7.1) where | 1 >, | 2 >... represent columns of the matrix AFor any n-dimensional vector, such as "1", there exist n-1 linearly independent vectors "ck" that are orthogonal to "1"; that is, each satisfies the relation:
< ck | 1 > = 0, for k = 1...m, where m = n - 1If we could find all such vectors, then we could multiply the equation (7.1) by each of them, and end up with m = n-1 following equations
< c1 | 2 > x2 + < c1 | 3 > x3 + ... < c1 | n > xn = < c1 | b > < c2 | 2 > x2 + < c2 | 3 > x3 + ... < c2 | n > xn = < c2 | b > ....... < cm | 2 > x2 + < cm | 3 > x3 + ... < cm | n > xn = < cm | b >But the above is nothing more than a new matrix equation
A' | x' > = | b' > or x2 | 2'> + x3 | 3'> .... + xn | n'> = | b'> where primed vectors | 2' >, etc. are the columns of the new matrix A'.with the problem dimension reduced by one.
x1 | 1 > + x2 | 2 > + x3 | 3 > + x4 | 4 > = | b > x2 | 2'> + x3 | 3'> + x4 | 4'> = | b'> x3 | 3''> + x4 | 4''> = | b''> x4 | 4'''> = | b'''>But if we premultiply each vector equation by a non-zero vector of our choice, say < 1 | , < 2' |, < 3'' |, and < 4''' | - chosen correspondingly for equations 1, 2, 3 and 4, then the above system of vector equations will be converted to much simpler system of scalar equations. The result is shown below in matrix representation:
| p11 p12 p13 p14 | | x1 | = | q1 | | 0 p22 p23 p24 | | x2 | = | q2 | | 0 0 p33 p34 | | x3 | = | q3 | | 0 0 0 p44 | | x4 | = | q4 |In effect, we have triangularized our original matrix A. Below is a function that does that for any problem size:
The triangular system of equations can be easily solved by successive substitutions - starting with the last equation.> one_ket_triangle :: (Scalar a, Fractional a) => [[a]] -> [a] -> [([a],a)] > one_ket_triangle a b > -- > -- List of pairs: (p, q) representing > -- rows of triangular matrix P and of vector | q > > -- in the equation P | x > = | q >, which > -- has been obtained by linear transformation > -- of the original equation A | x > = | b > > -- > | null a = [] > | otherwise = (p,q):(one_ket_triangle a' b') > where > p = [bra_ket u ak | ak <- a] > q = bra_ket u b > a' = [[bra_ket ck ai | ck <- orth] | ai <- v] > b' = [ bra_ket ck b | ck <- orth] > orth = orthogonals u > u = head a > v = tail a
The triangularization procedure can be easily extended to a list of several ket-vectors | b > on the right hand side of the original equation A | x > = | b > -- instead of just one:> one_ket_solution :: (Fractional a, Scalar a) => [[a]] -> [a] -> [a] > one_ket_solution a b = > -- > -- List representing vector |x>, which is > -- a solution of the matrix equation > -- A |x> = |b> > -- where > -- a is a list of columns of matrix A > -- b is a list representing vector |b> > -- > solve' (unzip (reverse (one_ket_triangle a b))) [] > where > solve' (d, c) xs > | null d = xs > | otherwise = solve' ((tail d), (tail c)) (x:xs) > where > x = (head c - (sum_product (tail u) xs))/(head u) > u = head d
Similarly, function 'one_ket_solution' can be generalized to function 'many_kets_solution' that handles cases with several ket-vectors on the right hand side.> many_kets_triangle :: (Scalar a, Fractional a) => [[a]] -> [[a]] -> [([a],[a])] > many_kets_triangle a b > -- > -- List of pairs: (p, q) representing > -- rows of triangular matrix P and of rectangular matrix Q > -- in the equation P X = Q, which > -- has been obtained by linear transformation > -- of the original equation A X = B > -- where > -- a is a list of columns of matrix A > -- b is a list of columns of matrix B > -- > | null a = [] > | otherwise = (p,q):(many_kets_triangle a' b') > where > p = [bra_ket u ak | ak <- a] > q = [bra_ket u bk | bk <- b] > a' = [[bra_ket ck ai | ck <- orth] | ai <- v] > b' = [[bra_ket ck bi | ck <- orth] | bi <- b] > orth = orthogonals u > u = head a > v = tail a
> many_kets_solution :: (Scalar a, Fractional a) => [[a]] -> [[a]] -> [[a]] > many_kets_solution a b = > -- > -- List of columns of matrix X, which is > -- a solution of the matrix equation > -- A X = B > -- where > -- a is a list of columns of matrix A > -- b is a list of columns of matrix B > -- > solve' p q emptyLists > where > (p, q) = unzip (reverse (many_kets_triangle a b)) > emptyLists = [[] | _ <- [1..(length (head q))]] > solve' a' b' x > | null a' = x > | otherwise = solve' (tail a') (tail b') > [(f vk xk):xk | (xk, vk) <- (zip x v)] > where > f vk xk = (vk - (sum_product (tail u) xk))/(head u) > u = head a' > v = head b'
Matrix inversion
Function 'many_kets_solution' can be used to compute inverse of matrix A by specializing matrix B to a unit matrix I:
A X = IIt follows that matrix X is an inverse of A; that is X = A^{-1}.
> inverse :: (Scalar a, Fractional a) => [[a]] -> [[a]] > inverse a = many_kets_solution a (unit_matrix (length a)) > -- > -- List of columns of inverse of matrix A > -- where > -- a is list of columns of A
QR factorization of matrices
The process described above and implemented by 'many_kets_triangle' function transforms the equation
A X = Binto another equation for the same matrix X
R X = Swhere R is an upper triangular matrix. All operations performed on matrices A and B during this process are linear, and therefore we should be able to find a square matrix Q that describes the entire process in one step. Indeed, assuming that matrix A can be decomposed as a product of unknown matrix Q and triangular matrix R and that Q^{-1} is an inverse of matrix Q we can reach the last equation by following these steps:
A X = B (Q R) X = B Q^{-1} Q R X = Q^{-1} B R X = SIt follows that during this process a given matrix B transforms to matrix S, as delivered by 'many_kets_triangle':
S = Q^{-1} Bfrom which the inverse of Q can be found:
Q^{-1} = S B^{-1}Having a freedom of choice of the right hand side matrix B we can choose the unit matrix I in place of B, and therefore simplify the definition of Q^{-1}:
Q^{-1} = S, if B is unit matrixIt follows that any non-singular matrix A can be decomposed as a product of a matrix Q and a triangular matrix R
A = Q Rwhere matrices Q^{-1} and R are delivered by "many_kets_triangle" as a result of triangularization process of equation:
A X = IThe function below extracts a pair of matrices Q and R from the answer provided by "many_kets_triangle". During this process it inverts matrix Q^{-1} to Q. This factorization will be used by a sequence of similarity transformations to be defined in the next section.
> factors_QR :: (Fractional a, Scalar a) => [[a]] -> ([[a]],[[a]]) > factors_QR a = > -- > -- A pair of matrices (Q, R), such that > -- A = Q R > -- where > -- R is upper triangular matrix in row representation > -- (without redundant zeros) > -- Q is a transformation matrix in column representation > -- A is square matrix given as columns > -- > (inverse (transposed q1),r) > where > (r, q1) = unzip (many_kets_triangle a (unit_matrix (length a)))
Computation of the determinant
Naive division-free computation of the determinant by expanding the first column. It consumes n! multiplications.> determinant :: (Fractional a, Scalar a) => [[a]] -> a > determinant a = > let (q,r) = factors_QR a > -- matrix Q is not normed so we have to respect the norms of its rows > in product (map norm q) * product (map head r)
Compute the determinant with about n^4 multiplications without division according to the clow decomposition algorithm of Mahajan and Vinay, and Berkowitz as presented by GÃ¼nter Rote: Division-Free Algorithms for the Determinant and the Pfaffian: Algebraic and Combinatorial Approaches.> determinantNaive :: (Num a) => [[a]] -> a > determinantNaive [] = 1 > determinantNaive m = > sum (alternate > (zipWith (*) (map head m) > (map determinantNaive (removeEach (map tail m)))))
Compute the weights of all clow sequences where the last clow is closed and a new one is started.> determinantClow :: (Num a) => [[a]] -> a > determinantClow [] = 1 > determinantClow m = > let lm = length m > in parityFlip lm (last (newClow m > (nest (lm-1) (longerClow m) > (take lm (iterate (0:) [1])))))
Compute the weights of all clow sequences where the last (open) clow is extended by a new arc.> newClow :: (Num a) => [[a]] -> [[a]] -> [a] > newClow a c = > scanl (-) 0 > (sumVec (zipWith (zipWith (*)) (List.transpose a) c))
Given the matrix of all weights of clows of length l compute the weight matrix for all clows of length (l+1). Take the result of 'newClow' as diagonal and the result of 'extendClow' as lower triangle of the weight matrix.> extendClow :: (Num a) => [[a]] -> [[a]] -> [[a]] > extendClow a c = > map (\ai -> sumVec (zipWith scaleVec ai c)) a
Auxiliary functions for the clow determinant.> longerClow :: (Num a) => [[a]] -> [[a]] -> [[a]] > longerClow a c = > let diagonal = newClow a c > triangle = extendClow a c > in zipWith3 (\i t d -> take i t ++ [d]) [0 ..] triangle diagonal
> {- | Compositional power of a function, > i.e. apply the function n times to a value. -} > nest :: Int -> (a -> a) -> a -> a > nest 0 _ x = x > nest n f x = f (nest (n-1) f x) > > {- successively select elements from xs and remove one in each result list -} > removeEach :: [a] -> [[a]] > removeEach xs = > zipWith (++) (List.inits xs) (tail (List.tails xs)) > > alternate :: (Num a) => [a] -> [a] > alternate = zipWith id (cycle [id, negate]) > > parityFlip :: Num a => Int -> a -> a > parityFlip n x = if even n then x else -x > > {-| Weight a list of numbers by a scalar. -} > scaleVec :: (Num a) => a -> [a] -> [a] > scaleVec k = map (k*) > > {-| Add corresponding numbers of two lists. -} > {- don't use zipWith because it clips to the shorter list -} > addVec :: (Num a) => [a] -> [a] -> [a] > addVec x [] = x > addVec [] y = y > addVec (x:xs) (y:ys) = x+y : addVec xs ys > > {-| Add some lists. -} > sumVec :: (Num a) => [[a]] -> [a] > sumVec = foldl addVec []
Similarity transformations and eigenvalues
Two n-square matrices A and B are called similar if there exists a non-singular matrix S such that:
B = S^{-1} A SIt can be proven that:
If matrix A can be transformed to a triangular or a diagonal matrix Ak by a sequence of similarity transformations then the eigenvalues of matrix A are the diagonal elements of Ak.
Let's construct the sequence of matrices similar to A
A, A1, A2, A3...by the following iterations - each of which factorizes a matrix by applying the function 'factors_QR' and then forms a product of the factors taken in the reverse order:
A = Q R = Q (R Q) Q^{-1} = Q A1 Q^{-1} = = Q (Q1 R1) Q^{-1} = Q Q1 (R1 Q1) Q1^{-1} Q^{-1} = Q Q1 A2 Q1^{-1} Q^{-1} = = Q Q1 (Q2 R2) Q1^{-1} Q^{-1} = ...We are hoping that after some number of iterations some matrix Ak would become triangular and therefore its diagonal elements could serve as eigenvalues of matrix A. As long as a matrix has real eigenvalues only, this method should work well. This applies to symmetric and hermitian matrices. It appears that general complex matrices -- hermitian or not -- can also be handled this way. Even more, this method also works for some nonsymmetric real matrices, which have real eigenvalues only.
> similar_to :: (Fractional a, Scalar a) => [[a]] -> [[a]] > similar_to a = > -- > -- List of columns of matrix A1 similar to A > -- obtained by factoring A as Q R and then > -- forming the product A1 = R Q = (inverse Q) A Q > -- where > -- a is list of columns of A > -- > triangle_matrix' r q > where > (q,r) = factors_QR a> iterated_eigenvalues :: (Scalar a1, Fractional a1, Num a) => [[a1]] -> a -> [[a1]] > iterated_eigenvalues a n > -- > -- List of vectors representing > -- successive approximations of > -- eigenvalues of matrix A > -- where > -- a is a list of columns of A > -- n is a number of requested iterations > -- > | n == 0 = [] > | otherwise = (diagonals a) > : iterated_eigenvalues (similar_to a) (n-1)> eigenvalues :: (Scalar a1, Fractional a1, Num a) => [[a1]] -> a -> [a1] > eigenvalues a n > -- > -- Eigenvalues of matrix A > -- obtained by n similarity iterations > -- where > -- a are the columns of A > -- > | n == 0 = diagonals a > | otherwise = eigenvalues (similar_to a) (n-1)
Preconditioning of real nonsymmetric matrices
As mentioned above, our QR-like factorization method works well with almost all kind of matrices, but with the exception of a class of real nonsymmetric matrices that have complex eigenvalues.
Consider the eigenproblem for real and nonsymmetric matrix A.
A | x > = a | x >Let us now define a new complex matrix B, such that:
B = A + alpha I where I is a unit matrix and alpha is a complex scalarIt is obvious that matrices A and B commute; that is:
A B = B AIt can be proven that if two matrices commute then they have the same eigenvectors. Therefore we can use vector | x > of matrix A as an eigenvector of B:
B | x > = b | x > B | x > = A | x > + alpha I | x > = a | x > + alpha | x > = (a + alpha) | x >It follows that eigenvalues of B are related to the eigenvalues of A by:
b = a + alphaAfter eigenvalues of complex matrix B have been succesfully computed, all what remains is to subtract "alpha" from them all to obtain eigenvalues of A. And nothing has to be done to eigenvectors of B - they are the same for A as well. Simple and elegant!
Below is an auxiliary function that adds a scalar to the diagonal of a matrix:
> add_to_diagonal :: Num a => a -> [[a]] -> [[a]] > add_to_diagonal alpha a = > -- > -- Add constant alpha to diagonal of matrix A > -- > [f ai ni | (ai,ni) <- zip a [0..(length a -1)]] > where > f b k = p++[head q + alpha]++(tail q) > where > (p,q) = splitAt k b >
Examples of iterated eigenvalues
Here is an example of a symmetric real matrix with results of application of function 'iterated_eigenvalues'.
| 7 -2 1 | |-2 10 -2 | | 1 -2 7 | [[7.0, 10.0, 7.0], [8.66667, 9.05752, 6.27582], [10.7928, 7.11006, 6.09718], [11.5513, 6.40499, 6.04367], [11.7889, 6.18968, 6.02142], [11.8943, 6.09506, 6.01068], [11.9468, 6.04788, 6.00534], [11.9733, 6.02405, 6.00267], [11.9866, 6.01206, 6.00134], [11.9933, 6.00604, 6.00067], [11.9966, 6.00302, 6.00034], [11.9983, 6.00151, 6.00017], [11.9992, 6.00076, 6.00008], [11.9996, 6.00038, 6.00004], [11.9998, 6.00019, 6.00002], [11.9999, 6.00010, 6.00001], [11.9999, 6.00005, 6.00001]] The true eigenvalues are: 12, 6, 6Here is an example of a hermitian matrix. (Eigenvalues of hermitian matrices are real.) The algorithm works well and converges fast.
| 2 0 i| [ 0 1 0 | [ -i 0 2 | [[2.8 :+ 0.0, 1.0 :+ 0.0, 1.2 :+ 0.0], [2.93979 :+ 0.0, 1.0 :+ 0.0, 1.06021 :+ 0.0], [2.97972 :+ 0.0, 1.0 :+ 0.0, 1.02028 :+ 0.0], [2.9932 :+ 0.0, 1.0 :+ 0.0, 1.0068 :+ 0.0], [2.99773 :+ 0.0, 1.0 :+ 0.0, 1.00227 :+ 0.0], [2.99924 :+ 0.0, 1.0 :+ 0.0, 1.00076 :+ 0.0], [2.99975 :+ 0.0, 1.0 :+ 0.0, 1.00025 :+ 0.0], [2.99992 :+ 0.0, 1.0 :+ 0.0, 1.00008 :+ 0.0], [2.99997 :+ 0.0, 1.0 :+ 0.0, 1.00003 :+ 0.0], [2.99999 :+ 0.0, 1.0 :+ 0.0, 1.00001 :+ 0.0], [3.0 :+ 0.0, 1.0 :+ 0.0, 1.0 :+ 0.0], [3.0 :+ 0.0, 1.0 :+ 0.0, 1.0 :+ 0.0], [3.0 :+ 0.0, 1.0 :+ 0.0, 1.0 :+ 0.0]]Here is another example: this is a complex matrix and it is not even hermitian. Yet, the algorithm still works, although its fluctuates around true values.
| 2-i 0 i | | 0 1+i 0 | | i 0 2-i | [[2.0 :+ (-1.33333), 1.0 :+ 1.0, 2.0 :+ (-0.666667)], [1.89245 :+ (-1.57849), 1.0 :+ 1.0, 2.10755 :+ (-0.421509)], [1.81892 :+ (-1.80271), 1.0 :+ 1.0, 2.18108 :+ (-0.197289)], [1.84565 :+ (-1.99036), 1.0 :+ 1.0, 2.15435 :+ (-0.00964242)], [1.93958 :+ (-2.07773), 1.0 :+ 1.0, 2.06042 :+ 0.0777281], [2.0173 :+ (-2.06818), 1.0 :+ 1.0, 1.9827 :+ 0.0681793], [2.04357 :+ (-2.02437), 1.0 :+ 1.0, 1.95643 :+ 0.0243654], [2.03375 :+ (-1.99072), 1.0 :+ 1.0, 1.96625 :+ (-0.00928429)], [2.01245 :+ (-1.97875), 1.0 :+ 1.0, 1.98755 :+ (-0.0212528)], [1.99575 :+ (-1.98307), 1.0 :+ 1.0, 2.00425 :+ (-0.0169263)], [1.98938 :+ (-1.99359), 1.0 :+ 1.0, 2.01062 :+ (-0.00640583)], [1.99145 :+ (-2.00213), 1.0 :+ 1.0, 2.00855 :+ 0.00212504], [1.9968 :+ (-2.00535), 1.0 :+ 1.0, 2.0032 :+ 0.00535265], [2.00108 :+ (-2.00427), 1.0 :+ 1.0, 1.99892 :+ 0.0042723], [2.00268 :+ (-2.00159), 1.0 :+ 1.0, 1.99732 :+ 0.00158978], [2.00213 :+ (-1.99946), 1.0 :+ 1.0, 1.99787 :+ (-0.000541867)], [2.00079 :+ (-1.99866), 1.0 :+ 1.0, 1.9992 :+ (-0.00133514)], [1.99973 :+ (-1.99893), 1.0 :+ 1.0, 2.00027 :+ (-0.00106525)], [1.99933 :+ (-1.9996) , 1.0 :+ 1.0, 2.00067 :+ (-0.000397997)], [1.99947 :+ (-2.00013), 1.0 :+ 1.0, 2.00053 :+ 0.000134972]] The true eigenvalues are 2 - 2i, 1 + i, 2Some nonsymmetric real matrices have all real eigenvalues and our algorithm still works for such cases. Here is one such an example, which traditionally would have to be treated by one of the Lanczos-like algorithms, specifically designed for nonsymmetric real matrices. Evaluation of
[[3.0, -0.70818,-0.291815], [3.06743, -3.41538, 2.34795], [3.02238, -1.60013, 0.577753], [3.00746, -2.25793, 1.25047], [3.00248, -1.88764, 0.885154], [3.00083, -2.06025, 1.05943], [3.00028, -1.97098, 0.970702], [3.00009, -2.0148, 1.01471], [3.00003, -1.99268, 0.992648], [3.00001, -2.00368, 1.00367], [3.0, -1.99817, 0.998161], [3.0, -2.00092, 1.00092], [3.0, -1.99954, 0.99954], [3.0, -2.00023, 1.00023], [3.0, -1.99989, 0.999885], [3.0, -2.00006, 1.00006], [3.0, -1.99997, 0.999971], [3.0, -2.00001, 1.00001], [3.0, -1.99999, 0.999993], [3.0, -2.0, 1.0]] The true eigenvalues are: 3, -2, 1Finally, here is a case of a nonsymmetric real matrix with complex eigenvalues:
| 2 -3 | | 1 0 |The direct application of "iterated_eigenvalues" would fail to produce expected eigenvalues:
1 + i sqrt(2) and 1 - i sqrt (2)But if we first precondition the matrix by adding "i" to its diagonal:
| 2+i -3| | 1 i|and then compute its iterated eigenvalues:
[[1.0 :+ 1.66667, 1.0 :+ 0.333333 ], [0.600936 :+ 2.34977, 1.39906 :+ (-0.349766)], [0.998528 :+ 2.59355, 1.00147 :+ (-0.593555)], [1.06991 :+ 2.413, 0.93009 :+ (-0.412998)], [1.00021 :+ 2.38554, 0.99979 :+ (-0.385543)], [0.988004 :+ 2.41407, 1.012 :+ (-0.414074)], [0.999963 :+ 2.41919, 1.00004 :+ (-0.419191)], [1.00206 :+ 2.41423, 0.99794 :+ (-0.414227)], [1.00001 :+ 2.41336, 0.99999 :+ (-0.413361)], [0.999647 :+ 2.41421, 1.00035 :+ (-0.414211)], [0.999999 :+ 2.41436, 1.0 :+ (-0.41436) ], [1.00006 :+ 2.41421, 0.99993 :+ (-0.414214)], [1.0 :+ 2.41419, 1.0 :+ (-0.414188)], [0.99999 :+ 2.41421, 1.00001 :+ (-0.414213)], [1.0 :+ 2.41422, 1.0 :+ (-0.414218)], [1.0 :+ 2.41421, 0.99999 :+ (-0.414213)], [1.0 :+ 2.41421, 1.0 :+ (-0.414212)], [1.0 :+ 2.41421, 1.0 :+ (-0.414213)], [1.0 :+ 2.41421, 1.0 :+ (-0.414213)], [1.0 :+ 2.41421, 1.0 :+ (-0.414213)]]After subtracting "i" from the last result, we will get what is expected.
Eigenvectors for distinct eigenvalues
Assuming that eigenvalues of matrix A are already found we may now attempt to find the corresponding aigenvectors by solving the following homogeneous equation
(A - a I) | x > = 0for each eigenvalue "a". The matrix
B = A - a Iis by definition singular, but in most cases it can be triangularized by the familiar "factors_QR" procedure.
B | x > = Q R | x > = 0It follows that the unknown eigenvector | x > is one of the solutions of the homogeneous equation:
R | x > = 0where R is a singular, upper triangular matrix with at least one zero on its diagonal.
| 0 1 1 3 | | x1 | | 0 1 1 2 | | x2 | /\ | 0 0 2 4 | | x3 | = 0 || | 0 0 0 0 | | x4 | ||Recall that the diagonal elements of any triangular matrix are its eigenvalues. Our example matrix has three distinct eigenvalues: 0, 1, 2. The eigenvalue 0 has degree of degeneration two. Presence of degenerated eigenvalues complicates the solution process. The complication arises when we have to make our decision about how to solve the trivial scalar equations with zero coefficients, such as
0 * x4 = 0resulting from multiplication of the bottom row by vector | x >. Here we have two choices: "x4" could be set to 0, or to any nonzero number 1, say. By always choosing the "0" option we might end up with the all-zero trivial vector -- which is obviously not what we want. Persistent choice of the "1" option, might lead to a conflict between some of the equations, such as the equations one and four in our example.
So the strategy is as follows.
If there is at least one zero on the diagonal, find the topmost row with zero on the diagonal and choose for it the solution "1". Diagonal zeros in other rows would force the solution "0". If the diagonal element is not zero than simply solve an arithmetic equation that arises from the substitutions of previously computed components of the eigenvector. Since certain inaccuracies acumulate during QR factorization, set to zero all very small elements of matrix R.
By applying this strategy to our example we'll end up with the eigenvector
< x | = [1, 0, 0, 0]
If the degree of degeneration of an eigenvalue of A is 1 then the corresponding eigenvector is unique -- subject to scaling. Otherwise an eigenvector found by this method is one of many possible solutions, and any linear combination of such solutions is also an eigenvector. This method is not able to find more than one solution for degenerated eigenvalues. An alternative method, which handles degenerated cases, will be described in the next section.
The function below calculates eigenvectors corresponding to distinct selected eigenvalues of any square matrix A, provided that the singular matrix B = A - a I can still be factorized as Q R, where R is an upper triangular matrix.
> eigenkets :: (Scalar a, Fractional a) => [[a]] -> [a] -> [[a]] > eigenkets a u > -- > -- List of eigenkets of a square matrix A > -- where > -- a is a list of columns of A > -- u is a list of eigenvalues of A > -- (This list does not need to be complete) > -- > | null u = [] > | not (null x') = x':(eigenkets a (tail u)) > | otherwise = (eigenket_UT (reverse b) d []):(eigenkets a (tail u)) > where > a' = add_to_diagonal (-(head u)) a > x' = unit_ket a' 0 (length a') > b = snd (factors_QR a') > d = discriminant [head bk | bk <- b] 1 > discriminant v n > | null v = [] > | otherwise = x : (discriminant (tail v) m) > where > (x, m) > | (head u) == 0 = (n, 0) > | otherwise = (n, n) > eigenket_UT c e xs > | null c = xs > | otherwise = eigenket_UT (tail c) (tail e) (x:xs) > where > x = solve_row (head c) (head e) xs > > solve_row (v:vs) n x > | almostZero v = n > | otherwise = q/v > where > q > | null x = 0 > | otherwise = -(sum_product vs x) > > unit_ket b' m n > | null b' = [] > | all (== 0) (head b') = unit_vector m n > | otherwise = unit_ket (tail b') (m+1) n
Eigenvectors for degenerated eigenvalues
Few facts:
| 7 -2 1 | A = | -2 10 -2 | | 1 -2 7 |has two distinct eigenvalues: 12 and 6 -- the latter being degenerated with degree of two. Two corresponding eigenvectors are:
< x1 | = [1, -2, 1] -- for 12 < x2 | = [1, 1, 1] -- for 6It happens that those vectors are orthogonal, but this is just an accidental result. However, we are missing a third distinct eigenvector. To find it we need another method. One possibility is presented below and the explanation follows.
Let us assume a trial vector | x' >, such that> eigenket' :: (Scalar a, Fractional a) => [[a]] -> a -> a -> [a] -> [a] > eigenket' a alpha eps x' = > -- > -- Eigenket of matrix A corresponding to eigenvalue alpha > -- where > -- a is a list of columns of matrix A > -- eps is a trial inaccuracy factor > -- artificially introduced to cope > -- with singularities of A - alpha I. > -- One might try eps = 0, 0.00001, 0.001, etc. > -- x' is a trial eigenvector > -- > scaled [xk' - dk | (xk', dk) <- zip x' d] > where > b = add_to_diagonal (-alpha*(1+eps)) a > d = one_ket_solution b y > y = matrix_ket (transposed b) x'
| x' > = | x > + | d > where | x > is an eigenvector we seek | d > is an error of our estimation of | x >We first form a matrix B, such that:
B = A - alpha Iand multiply it by the trial vector | x' >, which results in a vector | y >
B | x' > = |y >On another hand:
B | x' > = B | x > + B | d > = B | d > because B | x > = A | x > - alpha | x > = 0Comparing both equations we end up with:
B | d > = | y >that is: with the system of linear equations for unknown error | d >. Finally, we subtract error | d > from our trial vector | x' > to obtain the true eigenvector | x >.
But there is some problem with this approach: matrix B is by definition singular, and as such, it might be difficult to handle. One of the two processes might fail, and their failures relate to division by zero that might happen during either the QR factorization, or the solution of the triangular system of equations.
But if we do not insist that matrix B should be exactly singular, but almost singular:
B = A - alpha (1 + eps) Ithen this method might succeed. However, the resulting eigenvector will be the approximation only, and we would have to experiment a bit with different values of "eps" to extrapolate the true eigenvector.
The trial vector | x' > can be chosen randomly, although some choices would still lead to singularity problems. Aside from this, this method is quite versatile, because:
Auxiliary functions
The functions below are used in the main algorithms of this module. But they can be also used for testing. For example, the easiest way to test the usage of resources is to use easily definable unit matrices and unit vectors, as in:
one_ket_solution (unit_matrix n::[[Double]]) (unit_vector 0 n::[Double]) where n = 20, etc.> unit_matrix :: Num a => Int -> [[a]] > unit_matrix m = > -- > -- Unit square matrix of with dimensions m x m > -- > [g 0 k | k <- [0..(m-1)]] > where > g i k > | i == m = [] > | i == k = 1:(g (i+1) k) > | otherwise = 0:(g (i+1) k) >> unit_vector :: Num a => Int -> Int -> [a] > unit_vector i m = > -- > -- Unit vector of length m > -- with 1 at position i, zero otherwise > [g i k| k <- [0..(m-1)]] > where > g j k > | j == k = 1 > | otherwise = 0> diagonals :: [[a]] -> [a] > diagonals a = > -- > -- Vector made of diagonal components > -- of square matrix a > -- > diagonals' a 0 > where > diagonals' b n > | null b = [] > | otherwise = > (head $ drop n $ head b) : (diagonals' (tail b) (n+1))
----------------------------------------------------------------------------- -- -- Copyright: -- -- (C) 1998 Numeric Quest Inc., All rights reserved -- -- Email: -- -- jans@numeric-quest.com -- -- License: -- -- GNU General Public License, GPL -- -----------------------------------------------------------------------------