monad-par-0.1: A library for parallel programming based on a monad

Control.Monad.Par

Contents

Description

This module provides a monad Par, for speeding up pure computations using parallel processors. It cannot be used for speeding up computations that use IO (for that, see Control.Concurrent). The result of a given Par computation is always the same, i.e. it is deterministic, but the computation may be performed more quickly if there are processors available to share the work.

For example, the following program fragment computes the values of (f x) and (g x) in parallel, and returns a pair of their results:

  runPar $ do
      fx <- pval (f x)  -- start evaluating (f x)
      gx <- pval (g x)  -- start evaluating (g x)
      a <- get fx       -- wait for fx
      b <- get gx       -- wait for gx
      return (a,b)      -- return results

Par can be used for specifying pure parallel computations in which the order of the computation is not known beforehand. The programmer specifies how information flows from one part of the computation to another, but not the order in which computations will be evaluated at runtime. Information flow is described using variables called IVars, which support put and get operations. For example, suppose you have a problem that can be expressed as a network with four nodes, where b and c require the value of a, and d requires the value of b and c:

                       a
                      / \  
                     b   c
                      \ /
                       d

Then you could express this in the Par monad like this:

   runPar $ do
       [a,b,c,d] <- sequence [new,new,new,new]
       fork $ do x <- get a; put b (x+1)
       fork $ do x <- get a; put c (x+2)
       fork $ do x <- get b; y <- get c; put d (x+y)
       fork $ do put a (3 :: Int)
       get d

The result of the above computation is always 9. The get operation waits until its input is available; multiple puts to the same IVar are not allowed, and result in a runtime error. Values stored in IVars are usually fully evaluated (although there are ways provided to pass lazy values if necessary).

In the above example, b and c will be evaluated in parallel. In practice, the work involved at each node here is far too small to see any benefit from parallelism: typically each node should involve much more work. The granularity is completely under your control: if the nodes are too small, the overhead of the Par monad will outweigh any parallelism benefits, whereas if the nodes are too large there might not be enough parallelism to use all the available processors.
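As a hypothetical illustration of controlling granularity, one can chunk a list before applying parMap so that each parallel task does a worthwhile amount of work. The names chunk and sumSquares below are invented for this sketch; only runPar and parMap come from the library:

```haskell
import Control.Monad.Par

-- Split a list into chunks of the given size, so each parallel
-- task processes a whole chunk rather than a single element.
chunk :: Int -> [a] -> [[a]]
chunk _ [] = []
chunk n xs = let (h, t) = splitAt n xs in h : chunk n t

-- Sum the squares of a list in parallel, one task per chunk.
sumSquares :: [Int] -> Int
sumSquares xs = sum (runPar (parMap (sum . map (^2)) (chunk 1000 xs)))

main :: IO ()
main = print (sumSquares [1..10000])  -- prints 333383335000
```

Tuning the chunk size (here 1000) trades scheduling overhead against available parallelism.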

Unlike Control.Parallel, in Control.Monad.Par parallelism is not combined with laziness, so sharing and granularity are completely under the control of the programmer. New units of parallel work are only created by fork, par, and a few other combinators.

The implementation is based on a work-stealing scheduler that divides the work as evenly as possible between the available processors at runtime.

Synopsis

The Par monad

runPar :: Par a -> a

fork :: Par () -> Par ()

forks a computation to happen in parallel. The forked computation may exchange values with other computations using IVars.
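A minimal sketch of fork communicating through an IVar (the names example and v are illustrative, not part of the library):

```haskell
import Control.Monad.Par

example :: Int
example = runPar $ do
  v <- new                  -- create an empty IVar
  fork (put v (42 :: Int))  -- the forked computation writes to it
  get v                     -- the parent waits for the result

main :: IO ()
main = print example  -- prints 42
```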

Communication: IVars

data IVar a

Instances

NFData (IVar a) 

new :: Par (IVar a)

creates a new IVar

newFull :: NFData a => a -> Par (IVar a)

creates a new IVar that contains a value

newFull_ :: a -> Par (IVar a)

creates a new IVar that contains a value (head-strict only)
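A quick sketch of newFull: the IVar is created already filled, so a subsequent get returns immediately without waiting for a put (the name ready is invented for this example):

```haskell
import Control.Monad.Par

-- An IVar created with newFull is already filled, so get
-- returns without blocking on a separate put.
ready :: Int
ready = runPar $ do
  v <- newFull (10 :: Int)
  get v

main :: IO ()
main = print ready  -- prints 10
```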

get :: IVar a -> Par a

read the value in an IVar. A get can only return when the value has been written by a prior or parallel put to the same IVar.

put :: NFData a => IVar a -> a -> Par ()

put a value into an IVar. Multiple puts to the same IVar are not allowed, and result in a runtime error.

put fully evaluates its argument, which therefore must be an instance of NFData. The idea is that this forces the work to happen when we expect it, rather than being passed to the consumer of the IVar and performed later, which often results in less parallelism than expected.

Sometimes partial strictness is more appropriate: see put_.

put_ :: IVar a -> a -> Par ()

like put, but only head-strict rather than fully-strict.
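A sketch contrasting the two: put_ evaluates its argument only to weak head normal form, so here the pair constructor is written while the components stay as thunks until the consumer demands them (lazyPair is an invented name):

```haskell
import Control.Monad.Par

-- put_ needs no NFData instance; it evaluates the pair to its
-- outermost constructor only, leaving (1 + 1) and (2 + 2) as
-- thunks that the consumer forces later.
lazyPair :: (Int, Int)
lazyPair = runPar $ do
  v <- new
  fork (put_ v (1 + 1, 2 + 2))
  get v

main :: IO ()
main = print lazyPair  -- prints (2,4)
```

With put instead of put_, the additions would be forced inside the forked task, which is usually what you want for parallelism.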

Operations

pval :: NFData a => a -> Par (IVar a)

equivalent to spawn . return

spawn :: NFData a => Par a -> Par (IVar a)

Like fork, but returns an IVar that can be used to query the result of the forked computation.

  spawn p = do
    r <- new
    fork (p >>= put r)
    return r

spawn_ :: Par a -> Par (IVar a)

Like spawn, but the result is only head-strict, not fully-strict.
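A small usage sketch for spawn, running two Par computations in parallel and collecting both results (the name both is illustrative):

```haskell
import Control.Monad.Par

-- Spawn two computations, then wait for each result with get.
both :: (Int, Int)
both = runPar $ do
  a <- spawn (return (2 * 3))
  b <- spawn (return (4 * 5))
  x <- get a
  y <- get b
  return (x, y)

main :: IO ()
main = print both  -- prints (6,20)
```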

parMap :: (Traversable t, NFData b) => (a -> b) -> t a -> Par (t b)

Applies the given function to each element of a data structure in parallel (fully evaluating the results), and returns a new data structure containing the results.

 parMap f xs = mapM (pval . f) xs >>= mapM get

parMap is commonly used for lists, where it has this specialised type:

 parMap :: NFData b => (a -> b) -> [a] -> Par [b]
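A minimal usage example of parMap at the list type (squares is an invented name):

```haskell
import Control.Monad.Par

-- Square each list element in parallel and collect the results,
-- preserving the order of the input list.
squares :: [Int]
squares = runPar (parMap (^2) [1..10])

main :: IO ()
main = print squares  -- prints [1,4,9,16,25,36,49,64,81,100]
```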

parMapM :: (Traversable t, NFData b) => (a -> Par b) -> t a -> Par (t b)

Like parMap, but the function is a Par monad operation.

 parMapM f xs = mapM (spawn . f) xs >>= mapM get
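A usage sketch for parMapM, where each element is processed by a Par computation (which could itself fork further work); doubled is an invented name:

```haskell
import Control.Monad.Par

-- Each element is handled by a monadic Par action rather than
-- a pure function.
doubled :: [Int]
doubled = runPar (parMapM (\x -> return (x * 2)) [1..5])

main :: IO ()
main = print doubled  -- prints [2,4,6,8,10]
```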

parMapReduceRangeThresh
  :: NFData a
  => Int                 -- threshold
  -> InclusiveRange      -- range over which to calculate
  -> (Int -> Par a)      -- compute one result
  -> (a -> a -> Par a)   -- combine two results (associative)
  -> a                   -- initial result
  -> Par a

Computes a binary map/reduce over a finite range. The range is recursively split into two, the result for each half is computed in parallel, and then the two results are combined. When the range reaches the threshold size, the remaining elements of the range are computed sequentially.

For example, the following is a parallel implementation of foldl (+) 0 (map (^2) [1..10^6]):

  parMapReduceRangeThresh 100 (InclusiveRange 1 (10^6))
        (\x -> return (x^2))
        (\x y -> return (x+y))
        0

parMapReduceRange :: NFData a => InclusiveRange -> (Int -> Par a) -> (a -> a -> Par a) -> a -> Par a

"Auto-partitioning" version of parMapReduceRangeThresh that chooses the threshold based on the size of the range and the number of processors.
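A usage sketch for parMapReduceRange, summing squares over a range and letting the library pick the partitioning threshold (total is an invented name):

```haskell
import Control.Monad.Par

-- Map each index to its square, then combine with addition,
-- starting from 0; the split threshold is chosen automatically.
total :: Int
total = runPar $
  parMapReduceRange (InclusiveRange 1 1000)
                    (\x -> return (x * x))
                    (\x y -> return (x + y))
                    0

main :: IO ()
main = print total  -- prints 333833500
```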

parFor :: InclusiveRange -> (Int -> Par ()) -> Par ()

Parallel for-loop over an inclusive range. Semantically equivalent to

 parFor (InclusiveRange n m) f = forM_ [n..m] f

except that the implementation will split the work into an unspecified number of subtasks in an attempt to gain parallelism. The exact number of subtasks is chosen at runtime, and is probably a small multiple of the available number of processors.

Strictly speaking, the semantics of parFor depend on the number of processors, so its behaviour is not deterministic. However, a good rule of thumb is to have no interdependencies between the elements: if this rule is followed, parFor has deterministic semantics. One easy way to follow it is to use only put or put_ in f, never get.
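A sketch following that rule: each loop iteration writes its own IVar with put and never calls get, so the result is deterministic regardless of how the range is split (the name results is invented for this example):

```haskell
import Control.Monad.Par

-- One IVar per loop index; the loop body only puts, never gets,
-- so iterations are independent and the result is deterministic.
results :: [Int]
results = runPar $ do
  vs <- sequence (replicate 5 new)
  parFor (InclusiveRange 0 4) (\i -> put (vs !! i) (i * i))
  mapM get vs

main :: IO ()
main = print results  -- prints [0,1,4,9,16]
```

(Indexing with !! is fine for a small example; a real program would use an array of IVars.)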