N-dimensional tensors


Literate Haskell module Tensor.lhs

Jan Skibinski, Numeric Quest Inc., Huntsville, Ontario, Canada

1999.10.08, last modified 1999.10.16

This is a quick sketch of what might become the basis of a real Tensor module. The module has quite a few limitations (listed below), and I'd like to get some feedback on how to design it properly. Nevertheless, it works and is able to tackle both complex and mundane manipulations in a very straightforward way.

We have made a few arbitrary decisions. For example, we consider a scalar to be a tensor of rank 0. This forces us to convert between true scalars and such tensors, but it also saves us a lot of headaches related to typing restrictions. This is a typical price paid for (too much?) generalization.

To get rid of those awful sums appearing in multiplications of tensors we introduce Einstein's summation convention by way of text examples, each followed by the equivalent Haskell example. Hopefully the convention is clear and will be appreciated for its economy of notation, which is standard in tensor calculus.

The datatype Tensor defined here is an instance of the classes Eq, Show and Num. That means that one can compare tensors for equality and perform basic numerical calculations, such as addition, negation, subtraction and multiplication, using the standard notation (==), (/=), (+), (-), (*). In addition, several customized operations, such as (<*>) and (<<*>>), are defined for a variety of inner products.

Limitations of this module:

Tensor components are Doubles. Why not Fraction, Complex, etc.? For the moment we will leave this question aside and return to it some time later. But we consider it an important question, which is evident from the attempts at such generalization in some of our other modules: Orthogonals and Fraction.

We are well aware that the decision to represent tensors as nested objects will have a significant impact on access to (and update of, if supported) such a data structure. Linear arrays seem to be better suited for such tasks: all indices must be explicitly computed first, but the access time is then constant. In contrast, the hierarchical data structure defined here requires very little effort in index computation, but the access time depends on the depth of the data tree.
But speed has not been tested yet, so we really do not know how inefficient this module is, and all of the above is pure speculation. Certain operations of this module seem to be quite well matched with this tree-like data structure, and because of this the design decision might not be so bad after all.

The shape of tensors defined here involves two parameters: dimension and rank. Rank is associated with the depth of the tensor tree and corresponds to the total number of indices by which you can access the individual components. No limits are imposed on ranks, and there are binary operations which involve tensors of different ranks. Dimension is associated with the breadth of the tree and corresponds to the number of values each index can take. Dimension is fixed via the constant dims. At first this might seem a severe limitation, but in fact one should never mix tensors of different dimensions. One usually works either with three-dimensional tensors (classical mechanics, electrodynamics, elasticity, etc.) or with four-dimensional tensors (relativity theory).

Tensor datatype


> module Tensor where
> import Prelude hiding ((<*>))  -- newer Preludes export Applicative's (<*>)
> import Data.Array(inRange)
> infixl 9 #      -- used for tensor indexing
> infixl 9 ##     -- used for indices expressed as lists
> infixl 7 <*>    -- inner product with one bound
> infixl 7 <<*>>  -- inner product with two bounds

A tensor contains either a scalar value or a list of tensors. This recursively defines a tensor of any rank in n-D space.


> data Tensor = S Double
>             | T [Tensor]

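Note that the list [Tensor] is expected to hold exactly dims components, although the type itself does not enforce this. For comparison, a definition restricted to three dimensions could be written as:

        data Tensor = S Double | T Tensor Tensor Tensor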

Rank is either 0 (scalars), 1 (vectors), or higher: 2, 3, 4 ...


> rank :: Tensor -> Int
> rank t = rank' 0 t where
>       rank' n (S _)     = n
>       rank' n (T xs)    = rank' (n+1) (head xs)


> dims :: Int
> dims = 3
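
Because the list inside T is not constrained by the type, a malformed tree can be built by hand. The following standalone sketch shows one way such a value could be detected. The helper checkedRank is hypothetical and not part of this module; Tensor and dims are repeated only to keep the example self-contained.

```haskell
-- Standalone sketch: Tensor and dims mirror the definitions in this module.
data Tensor = S Double | T [Tensor]

dims :: Int
dims = 3

-- checkedRank is a hypothetical helper, not part of the module.
-- It returns the rank of a well-formed tensor, or Nothing when some
-- node does not have exactly dims children or the children differ in rank.
checkedRank :: Tensor -> Maybe Int
checkedRank (S _) = Just 0
checkedRank (T xs)
  | length xs /= dims = Nothing
  | otherwise =
      case mapM checkedRank xs of
        Just (r : rs) | all (== r) rs -> Just (r + 1)
        _                             -> Nothing
```

The module's own rank inspects only the first branch of the tree, so it reports 2 for a ragged value such as T [T [S 1, S 2, S 3], S 4, S 5], whereas checkedRank returns Nothing for it.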

Showing

Tensors are printed as recursive lists with the word "Tensor" prepended:


> instance Show Tensor where
>       showsPrec 0 (S a)     = showString "Tensor " . showsPrec 0 a
>       showsPrec n (S a)     = showsPrec n a

>       showsPrec 0 (T xs)    = showString "Tensor " . showList' 0 xs
>       showsPrec n (T xs)    = showList' n xs

> showList' :: (Show t) => Int -> [t] -> String -> String
> showList' _ [] = showString "[]"
> showList' n (x:xs) = showChar '[' . showsPrec (n+1) x . showRem (n+1) xs
>       where
>               showRem _ [] = showChar ']'
>               showRem o (y:ys) = showChar ',' . showsPrec o y . showRem o ys

Input

Although tensors are printed as structured lists, it is easier to input data via flat lists. But make sure that the length of the list is one of: dims^0, dims^1, dims^2, dims^3, dims^4, etc.

The function tensor, defined below, is quite inefficient for ranks higher than 4. Compare, for example, the timings of:

        tensor [1..3^6]
        tensor [1..3^3] * tensor [1..3^3]

The function tensor converts a flat list of components into a tensor:



> tensor :: [Double] -> Tensor
> tensor xs
>       | size == 0 = error "Empty list"
>       | size == 1 = S (head xs)
>       | q /= 0    = error "Length is not a power of dims"
>       | otherwise = T (tlist p xs)
>       where
>           (p,q) = rnk 1 (quotRem size dims)
>           rnk m (1, v) = (m, v)
>           rnk m (u, 0) = rnk (m+1) (quotRem u dims)
>           rnk m (_, v) = (m, v)
>           size   = length xs
>           group n ys = group' n ys [] where
>               group' o zs as
>                   | length zs == 0 = reverse as
>                   | length zs < o  = reverse (zs:as)
>                   | otherwise      = group' o (drop o zs) ((take o zs):as)
>
>           tlist :: Int -> [Double] -> [Tensor]
>           tlist 1 zs   = map S zs
>           tlist rnl zs = tlist' (rnl-1) (map S zs)
>               where
>                   tlist' 0 fs = fs
>                   tlist' o fs = tlist' (o-1) $ map T $ group dims fs

Extraction and conversion

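Tensor components are also tensors and can be extracted via the (#) operator, whose indices start at 1. The operator (##) does the same for a list of indices, while scalar and vector convert tensors of rank 0 and rank 1 to a Double and a list of Doubles, respectively: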


> ( # ) :: Tensor -> Int -> Tensor
> (S a1) # 1  = S a1
> (S _) # _  = error "out of range"
> (T xs) # i  = xs!!(i-1)

> ( ## ) :: Tensor -> [Int] -> Tensor
> a ## [] = a
> a ## (x:xs) = (a#x) ## xs


> scalar :: Tensor -> Double
> scalar (S a)  = a
> scalar (T _) = error "rank not 0"


> vector :: Tensor -> [Double]
> vector (S _)         = error "rank not 1"
> vector a@(T xs)
>       | rank a /= 1  = error "rank not 1"
>       | otherwise    = map scalar xs
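
The operators (#) and (##) fail with a run-time error when an index is out of range. As an illustration only, a total variant could return a Maybe instead. The function at below is hypothetical and not part of this module. Unlike (#), it also rejects any attempt to index into a scalar.

```haskell
-- Standalone sketch: Tensor mirrors the definition in this module.
data Tensor = S Double | T [Tensor] deriving (Eq, Show)

-- at is a hypothetical safe counterpart of (##): it follows a list of
-- 1-based indices and returns Nothing as soon as one of them is invalid.
at :: Tensor -> [Int] -> Maybe Tensor
at t []     = Just t
at (S _) _  = Nothing
at (T xs) (i : is)
  | i < 1 || i > length xs = Nothing
  | otherwise              = at (xs !! (i - 1)) is
```

For a rank-2 tensor m, the expression at m [2,3] plays the role of m#2#3, and at m [4] yields Nothing where m#4 would raise an error for dims = 3.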

Useful tensors: epsilon and delta

The function epsilon' i j k emulates the values of the pseudo-tensor E[ijk]. It is valid only for three-dimensional tensors. It takes three indices i, j, k from the range (1,3) and returns one of three values, 0.0, 1.0 or -1.0, depending on the rules specified below:


> epsilon' :: Int -> Int -> Int -> Double
> epsilon' i j k
>       | dims /= 3 = error "not 3-dims"
>       | outside (1,3) i j k = error "Not in range"
>       | (i == j) || (i == k) || (j == k)   =  0
>       | otherwise = epsilon1 i j k
>       where
>               epsilon1 m n o
>                       | (m == 1) && (n == 2) && (o == 3)   =  1
>                       | (m == 3) && (n == 2) && (o == 1)   = -1
>                       | otherwise = epsilon1 n o m
>               outside (p,q) a b c =
>                       (not $ inRange (p,q) a) ||
>                       (not $ inRange (p,q) b) ||
>                       (not $ inRange (p,q) c)


> delta' :: Int -> Int -> Double
> delta' i j
>       | i == j    = 1
>       | otherwise = 0


> delta, epsilon :: Tensor
> delta   = tensor [delta' i j     | i <- [1..dims], j <- [1..dims]]
> epsilon = tensor [epsilon' i j k | i <- [1..3], j <- [1..3], k <- [1..3]]

        scalar (epsilon#1#2#3) ==> 1.0
        scalar (epsilon#1#1#3) ==> 0.0
        scalar (epsilon#3#2#1) ==> -1.0
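
For indices restricted to the range (1,3), the same values can also be obtained from the closed-form expression (i-j)(j-k)(k-i)/2. The sketch below is only a cross-check of the rules above. The name epsilonClosed is hypothetical, and no range checking is performed.

```haskell
-- epsilonClosed is a hypothetical alternative to epsilon'.
-- For i, j, k taken from 1..3 the product (i-j)(j-k)(k-i) is 0 when
-- two indices coincide, +2 for cyclic orders of (1,2,3) and -2 for
-- the remaining permutations, so halving it gives 0, +1 or -1.
epsilonClosed :: Int -> Int -> Int -> Double
epsilonClosed i j k = fromIntegral ((i - j) * (j - k) * (k - i)) / 2
```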

Dot product

The dot product of two tensors of rank 1 could be defined as a tensor of rank 0. This is not the most efficient implementation, but we still want the dot product to be recognised as a tensor, so we lose some speed here:


> dot :: Tensor -> Tensor -> Tensor
> dot a b = S (sum [scalar (a#i) * scalar (b#i) | i <- [1..dims]])

Cross product - valid for 3D space only

The cross product of two vectors is another vector: C = A x B. The pseudotensor Eijk is used to compute such cross product.

First, here are the numerical components C[i] of C:


> cross'       :: Tensor -> Tensor -> Int -> Double
> cross' a b i = sum [(epsilon' i j k)* scalar (a#j) * scalar (b#k)|
>                       j<-[1..3],k<-[1..3], j/=k]


> cross     :: Tensor -> Tensor -> Tensor
> cross a b = tensor (map (cross' a b) [1..3])

        cross (tensor [1..3]) (tensor [1,8,1]) ==> Tensor [-22.0, 2.0, 6.0]

Equality of tensors

Tensors can be admitted to class Eq. We need to define only one of the two operations, equality or inequality. We have chosen to define the former: two tensors are equal if they have the same rank and equal components:


> instance Eq Tensor where
>       (==) a b
>               | ranka /= rank b = False
>               | ranka == 0      = scalar a == scalar b
>               | otherwise       = and [(a#i) == (b#i) | i <- [1..dims]]
>               where
>                       ranka = rank a
>

Tensor as instance of class Num

To admit tensors to class Num we have to support all the operations from that class. Here is the class Num declaration taken from the Prelude:

class (Eq a, Show a) => Num a where
    (+), (-), (*)  :: a -> a -> a
    negate         :: a -> a
    abs, signum    :: a -> a
    fromInteger    :: Integer -> a

    -- Minimal complete definition: All, except negate or (-)
    x - y           = x + negate y
    negate x        = 0 - x

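The multiplication (*) is defined here as the outer product

        c = a * b

The rank of the resulting tensor c is the sum of the ranks of both factors:

        rank c = rank a + rank b

and in general the outer product is not commutative:

        a * b /= b * a

Here is the instance Num for tensors: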


> instance Num Tensor where
>       (+) a b
>               | ranka /= rank b = error "different ranks"
>               | ranka == 0      = S (scalar a  + scalar b)
>               | otherwise       = T [a#i + b#i | i <- [1..dims]]
>               where
>                       ranka = rank a

>       negate (S a1)           = S (negate a1)
>       negate (T xs)           = T (map negate xs)

>       abs (S a1)              = S (abs a1)
>       abs (T xs)              = T (map abs xs)

>       signum (S a1)           = S (signum a1)
>       signum (T xs)           = T (map signum xs)

>       fromInteger n             = S (fromInteger n)

>       (*) (S a1) (S b1)     = S (a1*b1)
>       (*) a@(S _) (T xs)     = T (map (a*) (take dims xs))
>       (*) (T xs) b            = T (map (*b) (take dims xs))

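As an example of the outer product (*), if a is a tensor of rank 2 and b is a tensor of rank 1, then their product is a tensor of rank 3:

        c = a * b, that is
        c[ijk] = a[ij] b[k]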

Contraction

Einstein's indexing convention for tensors is based on the distinction between free indices and bound indices. Free indices appear only once in tensorial expressions such as A[ijkl], and they indicate the freedom to substitute any specific index from the range of valid indices. This range is (1,3) for 3D tensors. The expression A[ijkl] in fact represents any one of the 3^4 possible components of the tensor A.

Bound indices, on the other hand, appear in pairs (and only in pairs) and they indicate the summation of the tensor expression over the valid range. For example,

        A[kkj] = A[11j] + A[22j] + A[33j]

The process of converting a pair of free indices into a pair of bound indices is called contraction. As a result, the rank of a tensor (or of an expression involving several tensors) is reduced by two.

The function contract below accepts two integers m, n from the range (1, rank a), with m < n, which indicate the positions of the two indices to be used for contraction, and a tensor a of rank greater than or equal to 2. The result is a tensor with its rank reduced by two.



> contract :: Int -> Int -> Tensor -> Tensor
> contract m n a
>    | m >= n      = error "wrong ordering"
>    | outside m n = error "not in range"
>    | ranka <  2  = error "cannot contract"
>    | ranka == 2  = S (sum [scalar (a#i#i) | i <- [1..dims]])
>    | ranka >  2  = tensor [summa m n us a | us <- freeIndices (ranka-2)]
>    where
>        ranka = rank a
>
>        outside p q = (not $ inRange (1,ranka) p)
>                            ||(not $ inRange (1,ranka) q)
>        summa p q xs b = sum [scalar (b##(insert p q xs r)) |
>               r <- [1..dims]]

>        -- Insert element r at positions m n to the list
>        -- of indices xs
>        insert o p xs r = us++[r]++ws++[r]++zs
>               where
>                       (us,vs) = splitAt (o-1) xs
>                       (ws,zs) = splitAt (p - o - 1) vs
>
>        freeIndices 1 = [[x] | x <- [1..dims]]
>        freeIndices o = [x:y | x <- [1..dims], y <- freeIndices (o-1)]

For example, contracting the Kronecker delta over its two indices gives:

        delta[kk] = delta[1,1] + delta[2,2] + delta[3,3] = 1 + 1 + 1 = 3

        contract 1 2 delta        ==> Tensor 3.0
        rank (contract 1 2 delta) ==> 0
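
The summation over a bound index can also be illustrated without the Tensor type. The standalone sketch below stores a rank-3 tensor as plain nested lists, which is an assumption made only for this illustration, and computes A[kkj] as in the text example above. The name contractFirstTwo is hypothetical, and list positions are 0-based, unlike the 1-based indices used by (#).

```haskell
-- contractFirstTwo is a hypothetical helper working on nested lists.
-- It computes B[j] = A[kkj], summing over the bound index k.
-- The tensor is assumed to be cubic: n x n x n components.
contractFirstTwo :: [[[Double]]] -> [Double]
contractFirstTwo a =
  [ sum [ a !! k !! k !! j | k <- [0 .. n - 1] ] | j <- [0 .. n - 1] ]
  where
    n = length a
```

The result has one index left, in agreement with the rule that contraction reduces the rank by two.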

Inner product

The inner product of two tensors can be considered a two-phase process: first the outer product is formed, and then a contraction is applied to a selected pair of indices. There are countless ways of defining such inner products, since we can choose any pair, or even more than one pair, of indices to become bound.

How do we usually multiply tensors? Here is one example, which is equivalent to matrix-vector multiplication:

        C[i] = A[ij] B[j]

This is shorthand for three equations, one for each value of the free index i:

        C[1] = A[1j] B[j]
        C[2] = A[2j] B[j]
        C[3] = A[3j] B[j]

Expanding the sums over the bound index j gives:

        C[1] = A[11] B[1] + A[12] B[2] + A[13] B[3]
        C[2] = A[21] B[1] + A[22] B[2] + A[23] B[3]
        C[3] = A[31] B[1] + A[32] B[2] + A[33] B[3]

To obtain the above result we first form the outer product of the matrix A and the vector B, obtaining a tensor of rank 3, and then contract it over indices 2 and 3 to obtain the final expected result (the inner product):

        c = contract 2 3 (a * b)
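
The two-phase view can be checked on plain nested lists. The standalone sketch below uses that representation only as an assumption for illustration, and the names outer, contract23 and matVec are hypothetical. It forms the rank-3 outer product, contracts its last two indices, and allows a comparison with the ordinary matrix-vector product.

```haskell
-- Outer product of a matrix and a vector: t[i][j][k] = a[i][j] * b[k].
outer :: [[Double]] -> [Double] -> [[[Double]]]
outer a b = [[[aij * bk | bk <- b] | aij <- row] | row <- a]

-- Contraction over the second and third index: c[i] = t[i][j][j].
contract23 :: [[[Double]]] -> [Double]
contract23 t =
  [sum [slice !! j !! j | j <- [0 .. length slice - 1]] | slice <- t]

-- Ordinary matrix-vector product, for comparison: c[i] = a[i][j] * b[j].
matVec :: [[Double]] -> [Double] -> [Double]
matVec a b = [sum (zipWith (*) row b) | row <- a]
```

For a square matrix a and a vector b of matching length, contract23 (outer a b) and matVec a b give the same list. The two-phase route does more work, since it builds all components of the outer product first.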

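The system of equations

        C[i] = A[ij] B[j]

could also be coded directly as an explicit sum:

        c i = sum [scalar(a#i#j) * scalar(b#j) | j <- [1..dims]]
        -- valid for i = 1..dims

Instead of calling contract explicitly, we can use the operator <*>, defined below, which computes the same inner product:

        c      = a <*> b              -- the output is a tensor of rank 1
        c'  i  = (a <*> b)#i          -- the output is a tensor of rank 0
        c'' i  = scalar ((a <*> b)#i) -- the output is a number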

Convenience operators for inner products

A variety of specialized functions for inner products could be defined. We will show a few examples here and introduce specialized convenience operators for the most common types of inner products. Please note that the proposed operators are not standard in any way, and we are not trying to suggest that they are important. Just treat them as examples.

The semantics of the operator <*> has been chosen to support matrix-vector and vector-matrix multiplications. But this operator is more general than that, because it also handles products with scalars (tensors of rank 0) and, in general, products of any two tensors with one pair of indices bound: the last index of the first tensor and the first index of the second tensor.


> (<*>) :: Tensor -> Tensor -> Tensor
> a <*> b
>       | (ranka == 0) || (rankb == 0) = a * b
>       | otherwise = contract ranka (ranka + 1) (a * b)
>       where
>               ranka = rank a
>               rankb = rank b

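For example, the equation

        A[i] = delta[ij] B[j], where delta is the Kronecker delta

can be evaluated as:

        delta <*> tensor [4,5,6]     ==> Tensor [4.0, 5.0, 6.0]
        (delta <*> tensor [4,5,6])#1 ==> Tensor 4.0

The operator <<*>> binds two pairs of indices: the last two indices of the first tensor and the first two indices of the second tensor. It supports products such as:

        S[ij] = C[ijkl] G[kl]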


> (<<*>>) :: Tensor -> Tensor -> Tensor
> a <<*>> b
>       | (ranka < 2) || (rankb < 2) = error "rank too small"
>       | otherwise = contract (ranka-1) ranka
>               (contract ranka (ranka+2) (a * b))
>       where
>               ranka = rank a
>               rankb = rank b

        tensor [1..81] <<*>> tensor [1..9]

                ==> s = Tensor [[ 285.0,  690.0, 1095.0],
                                [1500.0, 1905.0, 2310.0],
                                [2715.0, 3120.0, 3525.0]]

        (tensor [1..81] <<*>> tensor [1..9])#1#1 ==> Tensor 285.0

Double cross products

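Here is another useful example of tensor multiplication. Say you want to compute the double cross product of three vectors:

        D = C x (A x B)

In index notation this reads:

        D[i] = E[ijk] C[j] E[kpq] A[p] B[q]

Since a cyclic permutation of indices does not change the pseudo-tensor E,

        E[ijk] = E[kij]

the above can be rewritten as:

        D[i] = E[kij] E[kpq] C[j] A[p] B[q]

Using the identity

        E[kij] E[kpq] = delta[ip] delta[jq] - delta[iq] delta[jp]

and the substitution property of the Kronecker delta, delta[ij] G[j] = G[i], the double cross product C x (A x B) reduces to:

        D[i] = C[j] B[j] A[i] - C[j] A[j] B[i]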

You should easily recognize that C[j] B[j] represents the scalar product. Therefore our double cross product can be represented as a difference of two vectors:

        D = C x (A x B) = (C o B) A - (C o A) B
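
The identity between the pseudo-tensor E and the Kronecker delta used in this derivation can be verified by brute force over all index values. The standalone sketch below works on plain integers. The names eps, kron and identityHolds are hypothetical, and eps relies on the closed form (i-j)(j-k)(k-i)/2, which is valid only for indices from 1 to 3.

```haskell
-- Permutation symbol for indices taken from 1..3.
eps :: Int -> Int -> Int -> Int
eps i j k = (i - j) * (j - k) * (k - i) `div` 2

-- Kronecker delta.
kron :: Int -> Int -> Int
kron i j = if i == j then 1 else 0

-- Checks E[kij] E[kpq] = delta[ip] delta[jq] - delta[iq] delta[jp]
-- for every combination of the free indices i, j, p, q,
-- with summation over the bound index k.
identityHolds :: Bool
identityHolds = and
  [ sum [eps k i j * eps k p q | k <- ix]
      == kron i p * kron j q - kron i q * kron j p
  | i <- ix, j <- ix, p <- ix, q <- ix ]
  where
    ix = [1 .. 3]
```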


> d_standard :: Tensor
> d_standard  = cross c (cross a b) where
>       a = tensor [1,2,3]
>       b = tensor [3,1,8]
>       c = tensor [5,2,4]

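The function d_standard above evaluates the double cross product directly. The function d_simpler below uses the simplified formula instead:

        D = (C o B) A - (C o A) B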


> d_simpler :: Tensor
> d_simpler =
>       tensor [n1 * scalar (a#i) - n2 * scalar (b#i) | i <- [1..dims]] where
>
>               a = tensor [1,2,3]
>               b = tensor [3,1,8]
>               c = tensor [5,2,4]
>               n1 = scalar (c `dot` b)
>               n2 = scalar (c `dot` a)

Both d_standard and d_simpler evaluate to the same tensor:

        d_standard ==> Tensor [-14.0, 77.0, -21.0]
        d_simpler  ==> Tensor [-14.0, 77.0, -21.0]

Vector transformation

A vector can be decomposed in any system of reference. The best choice is any orthogonal system of reference, where all base unit vectors are mutually perpendicular (orthogonal), since this simplifies the computations. The base vectors e[1], e[2], e[3] are usually chosen as vectors of length one (we say that they are normalized to one), and hence they are called "orthonormal". They obey the orthonormality relations for their scalar products:

        e[i] o e[j] = delta[ij]

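Here is an example of the vector decomposition:

        A = A[i] e[i]     (summation over "i"!)

The same vector can be decomposed in another orthonormal system with base vectors e'[i]:

        A'[i] e'[i] = A[i] e[i]

Taking the scalar product of both sides with e'[k] gives:

        A'[i] e'[k] o e'[i] = A[i] e'[k] o e[i]

Because of the orthonormality relation

        e'[k] o e'[i] = delta[ki]

the left-hand side reduces to:

        A'[i] delta[ki] = A'[k]

Denoting the scalar products of the new and old base vectors by

        R[ki] = e'[k] o e[i]

we obtain the transformation rule for the vector components:

        A'[k] = R[ki] A[i]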

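You might want to try this as an exercise, choosing the old system with the base vectors:

        e#1 = tensor [1,0,0]
        e#2 = tensor [0,1,0]
        e#3 = tensor [0,0,1]

that is, with

        e = tensor [1,0,0,
                    0,1,0,
                    0,0,1]

Given the base vectors e' of a new system, defined in the same way, the transformation tensor is:

        r     = tensor [scalar ((e'#k) `dot` (e#i)) | k<-[1..dims], i<-[1..dims]]

and the operator <*> transforms the components of a vector a:

        a' = r <*> a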
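
As a concrete instance of this exercise, the standalone sketch below uses plain lists instead of the Tensor type, which is an assumption made only to keep it self-contained. The names dotL, transformationMatrix and transform are hypothetical. The new system is taken to be the old one rotated by 90 degrees about the third axis.

```haskell
-- Scalar product of two vectors given as lists of components.
dotL :: [Double] -> [Double] -> Double
dotL u v = sum (zipWith (*) u v)

-- R[ki] = e'[k] o e[i]: new base vectors first, old base vectors second.
transformationMatrix :: [[Double]] -> [[Double]] -> [[Double]]
transformationMatrix newBase oldBase =
  [[dotL ek ei | ei <- oldBase] | ek <- newBase]

-- A'[k] = R[ki] A[i]
transform :: [[Double]] -> [Double] -> [Double]
transform r a = [dotL row a | row <- r]

-- Old system: the standard base vectors.
oldSystem :: [[Double]]
oldSystem = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

-- New system: the old one rotated by 90 degrees about the third axis.
newSystem :: [[Double]]
newSystem = [[0, 1, 0], [-1, 0, 0], [0, 0, 1]]
```

With these definitions, transform (transformationMatrix newSystem oldSystem) [1,2,3] gives the components [2.0, -1.0, 3.0] of the same vector in the new system. The length of the vector is unchanged, as expected for a transformation between orthonormal systems.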


-----------------------------------------------------------------------------
--
-- Copyright:
--
--      (C) 1999 Numeric Quest Inc., All rights reserved
--
-- Email:
--
--      jans@numeric-quest.com
--
-- License:
--
--      GNU General Public License, GPL
--
-----------------------------------------------------------------------------