| Safe Haskell | Safe-Inferred | 
|---|---|
| Language | Haskell2010 | 
Data.Repa.Eval.Generic.Par
Description
Generic parallel array computation operators.
Synopsis
- fillChunked :: Gang -> (Int# -> a -> IO ()) -> (Int# -> a) -> Int# -> IO ()
 - fillChunkedIO :: Gang -> (Int# -> a -> IO ()) -> (Int# -> IO (Int# -> IO a)) -> Int# -> IO ()
 - fillBlock2 :: Elt a => Gang -> (Int# -> a -> IO ()) -> (Int# -> Int# -> a) -> Int# -> Int# -> Int# -> Int# -> Int# -> IO ()
 - fillInterleaved :: Gang -> (Int# -> a -> IO ()) -> (Int# -> a) -> Int# -> IO ()
 - fillCursoredBlock2 :: Elt a => Gang -> (Int# -> a -> IO ()) -> (Int# -> Int# -> cursor) -> (Int# -> Int# -> cursor -> cursor) -> (cursor -> a) -> Int# -> Int# -> Int# -> Int# -> Int# -> IO ()
 - foldAll :: Gang -> (Int# -> a) -> (a -> a -> a) -> a -> Int# -> IO a
 - foldInner :: Gang -> (Int# -> a -> IO ()) -> (Int# -> a) -> (a -> a -> a) -> a -> Int# -> Int# -> IO ()
 
Filling
fillChunked
Arguments
| :: Gang | Gang to run the operation on.  | 
| -> (Int# -> a -> IO ()) | Update function to write into result buffer.  | 
| -> (Int# -> a) | Function to get the value at a given index.  | 
| -> Int# | Number of elements.  | 
| -> IO () | 
Fill a result buffer in parallel.
- The array is split into linear chunks, and each thread linearly fills one chunk.
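
The chunked strategy can be modelled with `base` alone. The sketch below is only an illustration: `chunkBounds` and `fillChunkedModel` are hypothetical names, plain `forkIO` threads stand in for a `Gang`, boxed `Int` replaces `Int#`, and the way the remainder is distributed is an assumption rather than something this page specifies.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Exception (finally)
import Control.Monad (forM)

-- Hypothetical helper, not part of the library: half-open (start, end)
-- index ranges, one per thread. Any remainder goes to the first chunks.
-- Assumes threads >= 1.
chunkBounds :: Int -> Int -> [(Int, Int)]
chunkBounds threads len =
  [ (splitIx t, splitIx (t + 1)) | t <- [0 .. threads - 1] ]
  where
    (chunkLen, leftover) = len `quotRem` threads
    splitIx t
      | t < leftover = t * (chunkLen + 1)
      | otherwise    = t * chunkLen + leftover

-- Model of a chunked fill: one forkIO thread per chunk, each walking its
-- chunk linearly; the caller blocks until every thread has finished.
fillChunkedModel
  :: Int -> (Int -> a -> IO ()) -> (Int -> a) -> Int -> IO ()
fillChunkedModel threads write getElem len = do
  dones <- forM (chunkBounds threads len) $ \(start, end) -> do
    done <- newEmptyMVar
    _ <- forkIO $
      mapM_ (\ix -> write ix (getElem ix)) [start .. end - 1]
        `finally` putMVar done ()
    return done
  mapM_ takeMVar dones
```

With three threads and ten elements, this split yields the ranges `(0,4)`, `(4,7)` and `(7,10)`.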
 
fillChunkedIO
Arguments
| :: Gang | Gang to run the operation on.  | 
| -> (Int# -> a -> IO ()) | Update function to write into result buffer.  | 
| -> (Int# -> IO (Int# -> IO a)) | Create a function to get the value at a given index. The first argument is the thread number, so you can do some per-thread initialisation.  | 
| -> Int# | Number of elements.  | 
| -> IO () | 
Fill a result buffer in parallel, using a separate IO action for each thread.
- The array is split into linear chunks, and each thread linearly fills one chunk.
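
The shape of the third argument of `fillChunkedIO` is easiest to see in a small model. The sketch below is sequential and uses hypothetical names (`fillChunkedIOModel`, `taggedGetter`) with boxed `Int` in place of `Int#`. It only illustrates the calling convention: the outer action runs once per thread, and the getter it returns runs once per element.

```haskell
import Control.Monad (forM_)
import Data.IORef (newIORef, modifyIORef', readIORef)

-- Sequential model of the calling convention only (hypothetical helper).
-- Each (start, end) pair is one thread's chunk, with `end` exclusive.
fillChunkedIOModel
  :: [(Int, Int)]
  -> (Int -> a -> IO ())
  -> (Int -> IO (Int -> IO a))
  -> IO ()
fillChunkedIOModel chunks write mkGet =
  forM_ (zip [0 ..] chunks) $ \(thread, (start, end)) -> do
    getElem <- mkGet thread                  -- per-thread initialisation
    forM_ [start .. end - 1] $ \ix ->
      getElem ix >>= write ix                -- per-element work

-- Example per-thread setup: a private counter, so each element is tagged
-- with (thread number, position within that thread's chunk).
taggedGetter :: Int -> IO (Int -> IO (Int, Int))
taggedGetter thread = do
  seen <- newIORef 0
  return $ \_ix -> do
    n <- readIORef seen
    modifyIORef' seen (+ 1)
    return (thread, n)
```

Per-thread state such as the counter here needs no locking, because each getter is only ever used by the thread that created it.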
 
fillBlock2
Arguments
| :: Elt a | |
| => Gang | Gang to run the operation on.  | 
| -> (Int# -> a -> IO ()) | Update function to write into result buffer.  | 
| -> (Int# -> Int# -> a) | Function to evaluate the element at an (x, y) index.  | 
| -> Int# | Width of the whole array.  | 
| -> Int# | x0: x coordinate of the lower-left corner of the block to fill.  | 
| -> Int# | y0: y coordinate of the lower-left corner of the block to fill.  | 
| -> Int# | w0: width of the block to fill.  | 
| -> Int# | h0: height of the block to fill.  | 
| -> IO () | 
Fill a block in a rank-2 array in parallel.
- Blockwise filling can be more cache-efficient than linear filling for rank-2 arrays.
 - Coordinates given are of the filled edges of the block.
 - We divide the block into columns, and give one column to each thread.
 - Each column is filled in row major order from top to bottom.
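
As a rough illustration of the traversal described for `fillBlock2`, the sketch below computes which linear buffer indices each thread would write. The helper names are hypothetical, boxed `Int` replaces `Int#`, and two details are assumptions: that the linear index is `x + y * imageWidth`, and that "top to bottom" means increasing `y`.

```haskell
-- Hypothetical helpers modelling the traversal, not the library's code.

-- Split the block's x range [x0, x0 + w0) into one strip per thread.
-- Assumes threads >= 1.
columnStrips :: Int -> Int -> Int -> [(Int, Int)]
columnStrips threads x0 w0 =
  [ (x0 + colIx t, x0 + colIx (t + 1)) | t <- [0 .. threads - 1] ]
  where
    (colLen, slack) = w0 `quotRem` threads
    colIx t
      | t < slack = t * (colLen + 1)
      | otherwise = t * colLen + slack

-- Linear buffer indices one thread writes, in the order it visits them:
-- row by row through its own strip.
stripIndices :: Int -> Int -> Int -> (Int, Int) -> [Int]
stripIndices imageWidth y0 h0 (xa, xb) =
  [ x + y * imageWidth | y <- [y0 .. y0 + h0 - 1], x <- [xa .. xb - 1] ]
```

For an array of width 8 and a block with `x0 = 2`, `y0 = 1`, `w0 = 4`, `h0 = 2` split over two threads, the first thread visits indices `[10,11,18,19]` and the second `[12,13,20,21]`.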
 
fillInterleaved
Arguments
| :: Gang | Gang to run the operation on.  | 
| -> (Int# -> a -> IO ()) | Update function to write into result buffer.  | 
| -> (Int# -> a) | Function to get the value at a given index.  | 
| -> Int# | Number of elements.  | 
| -> IO () | 
Fill a result buffer in parallel, using a round-robin order.
- Threads handle elements in row major, round-robin order.
 - Using this method helps even out unbalanced workloads.
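
To see why a round-robin order can even out an unbalanced workload, consider a workload whose cost grows with the index. The sketch below uses hypothetical helpers and boxed `Int`. It assumes that thread `t` of `n` takes indices `t`, `t + n`, `t + 2n`, and so on, which is one reading of "round-robin" rather than a statement about the library's internals.

```haskell
-- Hypothetical helpers, not part of the library. Assumes threads >= 1.

-- Indices handled by thread `t` under a round-robin assignment.
interleavedIndices :: Int -> Int -> Int -> [Int]
interleavedIndices threads len t = [t, t + threads .. len - 1]

-- Total cost each thread pays, given a per-index cost and the list of
-- indices assigned to each thread.
workPerThread :: (Int -> Int) -> [[Int]] -> [Int]
workPerThread cost = map (sum . map cost)
```

With eight elements, two threads and a cost equal to the index, two linear chunks `[0..3]` and `[4..7]` cost 6 and 22, while the interleaved assignment `[0,2,4,6]` and `[1,3,5,7]` costs 12 and 16.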
 
fillCursoredBlock2
Arguments
| :: Elt a | |
| => Gang | Gang to run the operation on.  | 
| -> (Int# -> a -> IO ()) | Update function to write into result buffer.  | 
| -> (Int# -> Int# -> cursor) | Make a cursor from an (x, y) index.  | 
| -> (Int# -> Int# -> cursor -> cursor) | Shift the cursor by an (x, y) offset.  | 
| -> (cursor -> a) | Function to evaluate the element at an index.  | 
| -> Int# | Width of the whole array.  | 
| -> Int# | x0: x coordinate of the lower-left corner of the block to fill.  | 
| -> Int# | y0: y coordinate of the lower-left corner of the block to fill.  | 
| -> Int# | w0: width of the block to fill.  | 
| -> Int# | h0: height of the block to fill.  | 
| -> IO () | 
Fill a block in a rank-2 array in parallel.
- Blockwise filling can be more cache-efficient than linear filling for rank-2 arrays.
 - Using cursor functions can help to expose inter-element indexing computations to the GHC and LLVM optimisers.
 - Coordinates given are of the filled edges of the block.
 - We divide the block into columns, and give one column to each thread.
 - We need the Elt constraint so that we can use its touch function to provide an order of evaluation amenable to the LLVM optimiser. You should compile your Haskell program with -fllvm -optlo-O3 to enable LLVM's Global Value Numbering optimisation.
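
The three cursor arguments of `fillCursoredBlock2` fit together roughly as in the sequential sketch below. All names are hypothetical and boxed `Int` replaces `Int#`. Handling two neighbouring elements per step, and deriving the second cursor by shifting the first, is an assumption chosen to show why cursors expose shared index arithmetic. It is not a description of the library's actual loop, and it omits the `touch` calls entirely.

```haskell
import Control.Monad (forM_)
import Data.IORef (newIORef, modifyIORef', readIORef)

-- Hypothetical sequential model of a cursored block fill.
fillCursoredModel
  :: (Int -> a -> IO ())               -- write into the result buffer
  -> (Int -> Int -> cursor)            -- make a cursor from (x, y)
  -> (Int -> Int -> cursor -> cursor)  -- shift a cursor by (dx, dy)
  -> (cursor -> a)                     -- evaluate the element at a cursor
  -> Int -> Int -> Int -> Int -> Int   -- image width, x0, y0, w0, h0
  -> IO ()
fillCursoredModel write mkCur shift getElem imageWidth x0 y0 w0 h0 =
  forM_ [y0 .. y0 + h0 - 1] (\y -> goRow y x0)
  where
    x1 = x0 + w0
    goRow y x
      | x + 1 < x1 = do                -- two elements from one base cursor
          let c0  = mkCur x y
              c1  = shift 1 0 c0
              dst = x + y * imageWidth
          write dst       (getElem c0)
          write (dst + 1) (getElem c1)
          goRow y (x + 2)
      | x < x1    = write (x + y * imageWidth) (getElem (mkCur x y))
      | otherwise = return ()

-- Example: the cursor is a linear index into a row-major source of width 4,
-- and each element is ten times its own index.
demo :: IO [(Int, Int)]
demo = do
  out <- newIORef []
  let width = 4
      mkCur x y       = x + y * width
      shift dx dy cur = cur + dx + dy * width
  fillCursoredModel (\ix v -> modifyIORef' out ((ix, v) :))
                    mkCur shift (* 10) width 1 0 3 2
  reverse <$> readIORef out
```

In this model the second element's position costs one addition on the base cursor instead of a fresh `x + y * width` computation, which is the kind of sharing an optimiser can exploit.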
Reduction
foldAll
Arguments
| :: Gang | Gang to run the operation on.  | 
| -> (Int# -> a) | Function to get an element from the source.  | 
| -> (a -> a -> a) | Binary associative combining function.  | 
| -> a | Starting value.  | 
| -> Int# | Number of elements.  | 
| -> IO a | 
Parallel tree reduction of an array to a single value. Each thread takes an equally sized chunk of the data and reduces it to a partial result. The main thread then reduces the array of partial results to the final result.
We don't require that the initial value be a neutral element, so each thread computes a fold1 on its chunk of the data, and the seed element is only applied in the final reduction step.
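
The role of the starting value in `foldAll` can be checked with a pure model. The sketch below is hypothetical and sequential, uses boxed `Int`, and splits the data into chunks of `ceiling (len / threads)` elements, which is an assumption. What it illustrates is the stated behaviour: a `fold1` per chunk, with the seed combined in exactly once at the end.

```haskell
-- Hypothetical pure model, not the library's code. Assumes threads >= 1.

-- Split a list into consecutive chunks of at most n elements (n >= 1).
-- No chunk is ever empty.
chunksOf' :: Int -> [a] -> [[a]]
chunksOf' n xs
  | null xs   = []
  | otherwise = let (h, t) = splitAt n xs in h : chunksOf' n t

foldAllModel :: Int -> (Int -> a) -> (a -> a -> a) -> a -> Int -> a
foldAllModel threads getElem f z len =
  foldl f z partials                        -- the seed enters only here
  where
    chunkLen = max 1 ((len + threads - 1) `div` threads)
    -- One fold1 per chunk; safe because chunksOf' yields no empty chunks.
    partials = map (foldl1 f)
                   (chunksOf' chunkLen (map getElem [0 .. len - 1]))
```

Summing the indices 0 to 9 over four chunks with a seed of 100 gives 145 in this model. Had every chunk started from the seed, the result would have been 445, which is why a non-neutral seed is safe here.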
foldInner
Arguments
| :: Gang | Gang to run the operation on.  | 
| -> (Int# -> a -> IO ()) | Function to write into the result buffer.  | 
| -> (Int# -> a) | Function to get an element from the source.  | 
| -> (a -> a -> a) | Binary associative combination operator.  | 
| -> a | Neutral starting value.  | 
| -> Int# | Total length of source.  | 
| -> Int# | Inner dimension (length to fold over).  | 
| -> IO () | 
Parallel reduction of a multidimensional array along the innermost dimension. Each output value is computed by a single thread, with the output values distributed evenly amongst the available threads.
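
The indexing behind `foldInner` can be sketched sequentially as follows. The names are hypothetical and boxed `Int` replaces `Int#`. The sketch assumes that the source is stored row-major, that the total length is a multiple of a positive inner dimension, and that output `i` reduces source elements `i * inner` to `i * inner + inner - 1`.

```haskell
import Control.Monad (forM_)
import Data.IORef (newIORef, modifyIORef', readIORef)

-- Hypothetical sequential model. Each output is produced by one
-- self-contained fold, which is what lets a parallel version hand
-- different outputs to different threads.
foldInnerModel
  :: (Int -> a -> IO ())   -- write into the result buffer
  -> (Int -> a)            -- get an element from the source
  -> (a -> a -> a)         -- associative combining operator
  -> a                     -- neutral starting value
  -> Int                   -- total length of source
  -> Int                   -- inner dimension (assumed > 0)
  -> IO ()
foldInnerModel write getElem f z len inner =
  forM_ [0 .. len `div` inner - 1] $ \row ->
    write row (foldl f z [ getElem (row * inner + i) | i <- [0 .. inner - 1] ])

-- Example: a 2 x 3 array stored row-major as 0..5, so the rows are
-- [0,1,2] and [3,4,5].
rowSums :: IO [(Int, Int)]
rowSums = do
  out <- newIORef []
  foldInnerModel (\ix v -> modifyIORef' out ((ix, v) :)) id (+) 0 6 3
  reverse <$> readIORef out
```

In this model the seed is applied to every output row, unlike in `foldAll`, which is consistent with the argument being documented as a neutral value.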