parallel-tasks-4.0.1.0

Safe Haskell: None

Control.Concurrent.ParallelTasks

Description

The parallel functions in this module all share the same underlying behaviour. You supply a list of tasks to perform, either in the IO monad or in some other monad m with a MonadIO instance. The library starts a limited number of worker threads (by default one per capability, i.e. one per available processor/core) and executes the work queue across those threads. This is better than simply starting all the jobs in parallel and waiting: when you have thousands or millions of jobs but only, say, 16 cores, you do not want the overhead of switching between all those contending threads.
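The work-queue idea described above can be sketched with nothing but base. This is a minimal illustration of the general technique, not the library's actual implementation; runPool is a hypothetical name, and the sketch has no progress reporting, time limits or exception handling.

```haskell
import Control.Concurrent (forkIO, getNumCapabilities)
import Control.Concurrent.MVar
  (modifyMVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Monad (forM, replicateM_)

-- | Hypothetical helper (not part of parallel-tasks): run the tasks on a
-- fixed number of worker threads and return the results in input order.
-- Limitation: a task that throws leaves its slot empty, so the caller
-- would block; a real implementation must handle exceptions.
runPool :: Int -> [IO a] -> IO [a]
runPool workers tasks = do
  -- Pair each task with an empty slot that will receive its result.
  slots <- forM tasks $ \t -> do
    slot <- newEmptyMVar
    return (t, slot)
  -- The shared queue of work that has not been claimed yet.
  queue <- newMVar slots
  let worker = do
        -- Atomically claim the next task, if any remain.
        next <- modifyMVar queue $ \q -> case q of
          []       -> return ([], Nothing)
          (x : xs) -> return (xs, Just x)
        case next of
          Nothing        -> return ()
          Just (t, slot) -> t >>= putMVar slot >> worker
  replicateM_ (max 1 workers) (forkIO worker)
  -- Wait for every slot to be filled, reading them in input order.
  mapM (takeMVar . snd) slots

main :: IO ()
main = do
  n <- getNumCapabilities
  results <- runPool n [return (i * i) | i <- [1 .. 10 :: Int]]
  print results
```

However many tasks are queued, only the requested number of threads ever exist, which is the property the paragraph above argues for.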

By default, these functions print progress reports to stderr while they run (the number of tasks completed and an estimate of the final completion time). The library is aimed at workloads of millions of jobs that take several hours to complete, so built-in progress output is useful while you wait. You can customise this behaviour by using the primed version of each function and supplying a customised options record.

The only difference between the functions parallelList, parallelVec and parallelIOVec is the type of the results returned. The closest to the underlying behaviour is parallelIOVec'; the other functions are simply convenience wrappers that freeze/convert the IOVector into a Vector or list.

Note: make sure you compile your program with the -threaded -with-rtsopts=-N options (e.g. in the ghc-options field in your cabal file), or else you will not get any parallel execution in your program!
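One way to confirm that the flags took effect is to inspect the runtime at startup. The sketch below uses only rtsSupportsBoundThreads and getNumCapabilities from base's Control.Concurrent; diagnose is a hypothetical helper, and the messages are illustrative.

```haskell
import Control.Concurrent (getNumCapabilities, rtsSupportsBoundThreads)
import System.IO (hPutStrLn, stderr)

-- | Hypothetical helper: describe the runtime setup, given whether the
-- threaded RTS is linked in and how many capabilities are available.
diagnose :: Bool -> Int -> String
diagnose threaded caps
  | not threaded = "not linked with -threaded: tasks will not run in parallel"
  | caps < 2     = "threaded RTS, but only 1 capability: check -N or the core count"
  | otherwise    = "threaded RTS with " ++ show caps ++ " capabilities"

main :: IO ()
main = do
  caps <- getNumCapabilities
  -- rtsSupportsBoundThreads is True only when linked against the threaded RTS.
  hPutStrLn stderr (diagnose rtsSupportsBoundThreads caps)
```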

The main parallel processing functions.

parallelList :: [IO a] -> IO [a]

Runs the list of tasks in parallel (a few at a time) and returns the results in a list, in the same order as the input list: the first task produces the first result. See the module description for more details.

Defined as: parallelList' defaultParTaskOpts
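A minimal usage sketch, assuming the parallel-tasks package is installed and the program is built with the flags from the note in the module description. The job function is a hypothetical stand-in for real work.

```haskell
import Control.Concurrent.ParallelTasks (parallelList)

-- | Hypothetical stand-in for an expensive job: sum the numbers 1..n.
job :: Int -> IO Int
job n = return $! sum [1 .. n]

main :: IO ()
main = do
  -- Progress reports go to stderr; the results come back in input order.
  results <- parallelList (map job [1 .. 1000])
  print (take 5 results)
```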

parallelVec :: [IO a] -> IO (Vector a)

As parallelList, but returns the results in an immutable Vector.

Defined as: parallelVec' defaultParTaskOpts

parallelIOVec :: [IO a] -> IO (IOVector a)

As parallelList, but returns the results in a mutable IOVector.

Defined as: parallelIOVec' defaultParTaskOpts

The configurable versions of the functions.

These versions can run in a monad other than IO and accept further options (such as killing off long-running tasks). See the documentation for ParTaskOpts.

parallelList' :: MonadIO m => ParTaskOpts m a -> [m a] -> m [a]

parallelVec' :: MonadIO m => ParTaskOpts m a -> [m a] -> m (Vector a)

parallelIOVec' :: MonadIO m => ParTaskOpts m a -> [m a] -> m (IOVector a)

The options available to configure the functions.

data SimpleParTaskOpts

Constructors

SimpleParTaskOpts 

Fields

numberWorkers :: Maybe Int

Number of worker threads to use. When this is Nothing, it defaults to the number of capabilities (see numCapabilities).

printProgress :: Maybe Int

How often to print the progress of the tasks. E.g. when Just 100, print a message roughly after the completion of every 100 tasks.

printEstimate :: Maybe Int

How often to print the estimated completion time. E.g. when Just 100, print an estimate after the completion of every 100 tasks.
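As an illustration of these fields, the sketch below builds a SimpleParTaskOpts value and plugs it into the defaults. It assumes that Nothing in printProgress and printEstimate turns the corresponding output off, which the field descriptions suggest but do not state outright; quietOpts is a hypothetical name.

```haskell
import Control.Concurrent.ParallelTasks

-- | Hypothetical options: four workers and, assuming Nothing disables
-- the corresponding output, no progress or estimate messages.
quietOpts :: SimpleParTaskOpts
quietOpts = SimpleParTaskOpts
  { numberWorkers = Just 4
  , printProgress = Nothing
  , printEstimate = Nothing
  }

main :: IO ()
main = do
  -- Keep the other defaults and swap in the simple options.
  let opts = defaultParTaskOpts { simpleOpts = quietOpts }
  results <- parallelList' opts [return (n * 2) | n <- [1 .. 100 :: Int]]
  print (sum results)
```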

data ParTaskOpts m a

Options controlling the general running of parallel tasks. The m parameter is the monad (which must be an instance of MonadIO) in which the tasks will be run, and the a parameter is the result type of the tasks.

Constructors

ParTaskOpts 

Fields

simpleOpts :: SimpleParTaskOpts

The simple options.

wrapWorker :: forall r. m (m r -> IO r)

Function to use to run the m monad on top of IO. The returned function is run at least once per worker, so it should support being run multiple times in parallel and should clean up after itself. A suitable value for IO is simply return id.

timeLimit :: Maybe (Integer, a)

When Just, the number of microseconds to let each task run for, before assuming it will not complete, and killing it off. In the case that the task is killed off, the second part of the pair is the value that will be stored in the vector.
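For a monad other than IO, wrapWorker has to capture whatever is needed to run a task in plain IO. The sketch below shows one plausible definition for ReaderT over IO, assuming the transformers package and that the library's MonadIO is the class from Control.Monad.IO.Class; App, appOpts and scaled are hypothetical names.

```haskell
import Control.Concurrent.ParallelTasks
import Control.Monad.Trans.Reader (ReaderT, ask, asks, runReaderT)

-- | Hypothetical application monad: a read-only Int environment over IO.
type App = ReaderT Int IO

-- | Options for running tasks in App. wrapWorker reads the environment
-- once and returns a function that runs a task against it in IO.
appOpts :: ParTaskOpts App a
appOpts = ParTaskOpts
  { simpleOpts = simpleOpts (defaultParTaskOpts :: ParTaskOpts IO ())
  , wrapWorker = asks (\env task -> runReaderT task env)
  , timeLimit  = Nothing
  }

-- | Hypothetical task: multiply the input by the environment value.
scaled :: Int -> App Int
scaled n = do
  k <- ask
  return (k * n)

main :: IO ()
main = do
  results <- runReaderT (parallelList' appOpts (map scaled [1 .. 10])) 3
  print results
```

Because the environment is immutable, the returned function can safely be run by several workers at once, as the wrapWorker description requires.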

defaultParTaskOpts :: ParTaskOpts IO a

Default parallel task options. The number of workers defaults to the number of capabilities, there is no time limit, progress is printed every 50 tasks, and an estimated completion time is printed every 200 tasks.
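The defaults can be adjusted with ordinary record update syntax. The sketch below adds a half-second time limit, with Nothing as the value stored for any task that is killed off; limited and slowTask are hypothetical names, and the delays are arbitrary.

```haskell
import Control.Concurrent (threadDelay)
import Control.Concurrent.ParallelTasks
import Control.Monad (when)

-- | Defaults plus a time limit of 500000 microseconds per task.
-- A task that exceeds the limit contributes Nothing to the results.
limited :: ParTaskOpts IO (Maybe Int)
limited = defaultParTaskOpts { timeLimit = Just (500000, Nothing) }

-- | Hypothetical task: task 3 sleeps for five seconds, well past the limit.
slowTask :: Int -> IO (Maybe Int)
slowTask n = do
  when (n == 3) (threadDelay 5000000)
  return (Just n)

main :: IO ()
main = do
  results <- parallelList' limited (map slowTask [1 .. 4])
  print results
```

Wrapping the result type in Maybe is one design choice for telling timed-out tasks apart from completed ones; any sentinel value of the task's result type would also fit the timeLimit field.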