lazyarray-0.1.3: Efficient implementation of lazy monolithic arrays (lazy in indexes).

Portabilitynon-portable (uses Data.Array.IO)
Stabilityexperimental
Maintainerfox@ucw.cz

Data.LazyArray

Contents

Description

This is an implementation of monolithic lazy array. If you want to know what an monolithic lazy array is good for, you can read for example the paper "Efficient Graph Algorithms Using Lazy Monolithic Arrays" (http://citeseer.ist.psu.edu/95126.html).

Module Data.LazyArray implements the monolithic lazy array using Array with temporary buffer. For implementation without the temporary buffer see Data.LazyArray.Lowlevel module.

Don't forget to import some kind of arrays when using this module. Results of small benchmark and a discussion can be found at the bottom of this page.

Synopsis

Generic lazy arrays

lArraySource

Arguments

:: Ix i 
=> (i, i)

bounds of the array

-> [(i, e)]

list of associations

-> Array i [e]

resulting array

This function returns an array, which i-th element is a list, that contains all values associated with i in the list of associations, in that order. In other words, the result of lArray bnds assoc is

lArray bnds assoc = array bnds [(i,[e|(i',e)<-assoc,i'==i] | i<-range bnds].

The important difference between the two sides of previous equation is that lArray works in linear time, ie. each member in the assoc list is accessed exactly once, and lArray is lazy in its second argument. It means the assoc list is left intact until an i-th element of the resulting array needs to be evaluated. At that time the assoc list is examined until the first pair (i,e') is found. Until such pair is found, processed elements of assoc are stored at appropriate indexes in the resulting array.

The implementation uses temporary array for such processed elements of assoc list, that were not yet 'requested' [or 'evaluated'] in the resulting array. This array can be freed when all lists of the resulting array are fully evaluated.

Here is an example of how to use the monolithic lazy array. Given a graph of type Array Int [Int] we construct the dfs numbering from a given vertex. Obviously, only the first element of the lArray is needed - lArrayFirst, lArrayMaybe or Data.LazyArray.Lowlevel are better choice here.

dnum::Array Int [Int]->Int->Array Int (Maybe Int)
dnum g s = amap listToMaybe marks where
   list = dfs' [s] 0
   marks = lArray (bounds g) list
   dfs' []     _ = []
   dfs' (s:ss) n = (s,n) : if n == head (marks!s) then dfs' ((g!s)++ss) (n+1) else dfs' ss n

lArrayMapSource

Arguments

:: Ix i 
=> ([e] -> e')

function to apply to the list of elements belonging to the same index

-> (i, i)

bounds of the array

-> [(i, e)]

list of associations

-> Array i e'

resulting array

It is often needed to apply some function to the list of elements belonging to the same index. Function lArrayMap is provided, and could be defined like lArrayMap f = (amap f) . lArray. Obviously, lArray = lArrayMap id.

Specialised lazy arrays

Only the first element from the list of elements belonging to the same element is often needed. In that case you can use one of the following methods:

lArrayMaybeSource

Arguments

:: Ix i 
=> (i, i)

bounds of the array

-> [(i, e)]

list of associations

-> Array i (Maybe e)

resulting array

This is equivalent to lArrayMaybe = lArrayMap listToMaybe but a bit faster.

lArrayFirstSource

Arguments

:: Ix i 
=> e

'zero' element,

-> (i, i)

bounds of the array

-> [(i, e)]

list of associations

-> Array i e

resulting array

This is equivalent to lArrayFirst zero = lArrayMap (\x->case x of []->zero; e:_->e) but a bit faster.

Benchmark

For comparison, various implementations of DFS numbering have been benchmarked. The benchmark can be found in the lazyarray package, in the file bench/bench.hs. The DFS was implemented using Data.LazyArray, Data.LazyArray.Lowlevel, Data.Array.Diff, Data.Map, Data.IntMap and in the Control.Monad.ST monad using Data.Array.ST.runSTArray and Data.Array.ST.runSTUArray. Three graphs were used, a path, a graph with 20 edges at each vertex, and a complete graph; each graph was used once as connected and once with isolated vertex iv. The DFS number of the isolated vertex was never asked for. Details can be found in the source file. This whole benchmark is written in Haskell, so it is only approximate.

 DFS implemented using     | path |path+iv|20-path|20path+iv|cmplt gph|cmplt gph+iv
 --------------------------|------|-------|-------|---------|---------|-------------
 lArrayFirst               | 5456 |  5572 |  1956 |   1968  |   1316  |    1296
 lArrayFirst with lArrayMap| 5248 |  5248 |  3456 |   3452  |   3072  |    3040
 lArrayMaybe               | 5772 |  5724 |  2044 |   2064  |   1308  |    1312
 lArrayMaybe with lArrayMap| 5380 |  5352 |  3560 |   3564  |   3048  |    3024
 Data.LazyArray.Lowlevel.laCreate                  | 4560 |  4700 |  4984 |   9360  |   5280  |   11140
 Data.LazyArray.Lowlevel.mlaCreate                 | 5308 |  5824 |  5972 |  11508  |   6328  |   13712
 Data.Array.Diff           |29089 | 29173 | 14652 |  14144  |  15072  |   14988
 Data.Map                  | 7216 |  7292 |  4892 |   4912  |   3316  |    3300
 Data.IntMap               | 3328 |  3280 |  2616 |   2640  |   2492  |    2524
 Data.Array.ST.runSTArray                | 1112 |  1116 |  1188 |   1384  |   1460  |    1468
 Data.Array.ST.runSTUArray               |  688 |   692 |  1164 |   1160  |   1444  |    1440

The numbers are elapsed time in ms, but use these numbers only to compare implementations in one column. The first two columns are quite inconclusive, because the DFS does not really do much when graph is a path and Data.Map and Data.IntMap implementations don't do exactly the same as the others implementations. The Data.LazyArray.Lowlevel.laCreate and Data.LazyArray.Lowlevel.mlaCreate perform very badly in the presence of isolated vertex - Data.LazyArray.Lowlevel.laFreeze tries hard to find a connection to the isolated vertex even if it is not asked for.