remote-0.1: Cloud Haskell

Remote.Task

Contents

Description

This module provides data dependency resolution and fault tolerance via promises (known elsewhere as futures). It's implemented in terms of the Remote.Process module.

Synopsis

Tasks and promises

data Promise a Source

The basic data type for expressing data dependence in the TaskM monad. A Promise represents a value that may or may not have been computed yet; thus, it's like a distributed thunk (in the sense of a non-strict unit of evaluation). These are created by newPromise and friends, and the underlying value can be gotten with readPromise.

runTask :: TaskM a -> ProcessM aSource

Starts a new context for executing a TaskM environment. The node on which this function is run becomes a new master in a Task application; as a result, the application should only call this function once. The master will attempt to control all nodes that it can find; if you are going to be running more than one CH application on a single network, be sure to give each application a different network magic (via cfgNetworkMagic). The master TaskM environment created by this function can then spawn other threads, locally or remotely, using newPromise and friends.

newPromise :: Serializable a => Closure (TaskM a) -> TaskM (Promise a)Source

Given a function (expressed here as a closure, see Remote.Call) that computes a value, returns a token identifying that value. This token, a Promise can be moved about even if the value hasn't been computed yet. The computing function will be started somewhere among the nodes visible to the current master, preferring those nodes that correspond to the defaultLocality. Afterwards, attempts to redeem the promise with readPromise will contact the node where the function is executing.

newPromiseAt :: Serializable a => Locality -> Closure (TaskM a) -> TaskM (Promise a)Source

A variant of newPromise that lets the user specify a Locality. The other flavors of newPromise, such as newPromiseAtRole, newPromiseNear, and newPromiseHere at just shorthand for a call to this function.

newPromiseNear :: (Serializable a, Serializable b) => Promise b -> Closure (TaskM a) -> TaskM (Promise a)Source

A variant of newPromise that prefers to start the computing function on the same node where some other promise lives. The other promise is not evaluated.

newPromiseHere :: Serializable a => Closure (TaskM a) -> TaskM (Promise a)Source

A variant of newPromise that prefers to start the computing function on the same node as the caller. Useful if you plan to use the resulting value locally.

newPromiseAtRole :: Serializable a => String -> Closure (TaskM a) -> TaskM (Promise a)Source

A variant of newPromise that prefers to start the computing functions on some set of nodes that have a given role (assigned by the cfgRole configuration option).

toPromise :: Serializable a => a -> TaskM (Promise a)Source

Like newPromise, but creates a promise whose values is already known. In other words, it puts a given, already-calculated value in a promise. Conceptually (but not syntactically, due to closures), you can consider it like this:

 toPromise a = newPromise (return a)

toPromiseAt :: Serializable a => Locality -> a -> TaskM (Promise a)Source

A variant of toPromise that lets the user express a locality preference, i.e. some information about which node will become the owner of the new promise. These preferences will not necessarily be respected.

toPromiseImm :: Serializable a => a -> TaskM (Promise a)Source

Creates an immediate promise, which is to say, a promise in name only. Unlike a regular promise (created by toPromise), this kind of promise contains the value directly. The advantage is that promise redemption is very fast, requiring no network communication. The downside is that it the underlying data will be copied along with the promise. Useful only for small data.

readPromise :: Serializable a => Promise a -> TaskM aSource

Given a promise, gets the value that is being calculated. If the calculation has finished, the owning node will be contacted and the data moved to the current node. If the calculation has not finished, this function will block until it has. If the calculation failed by throwing an exception (e.g. divide by zero), then this function will throw an excption as well (a TaskException). If the node owning the promise is not accessible, the calculation will be restarted.

MapReduce

data MapReduce rawinput input middle1 middle2 result Source

A data structure that stores the important user-provided functions that are the namesakes of the MapReduce algorithm. The number of mapper processes can be controlled by the user by controlling the length of the string returned by mtChunkify. The number of reducer promises is controlled by the number of values values returned by shuffler. The user must provide their own mapper and reducer. For many cases, the default chunkifier (chunkify) and shuffler (shuffle) are adequate.

Constructors

MapReduce 

Fields

mtMapper :: input -> Closure (TaskM [middle1])
 
mtReducer :: middle2 -> Closure (TaskM result)
 
mtChunkify :: rawinput -> [input]
 
mtShuffle :: [middle1] -> [middle2]
 

mapReduce :: (Serializable i, Serializable k, Serializable m, Serializable r) => MapReduce ri i k m r -> ri -> TaskM [r]Source

The MapReduce algorithm, implemented in a very simple form on top of the Task layer. Its use depends on four user-determined data types:

  • input -- The data type provided as the input to the algorithm as a whole and given to the mapper.
  • middle1 -- The output of the mapper. This may include some key which is used by the shuffler to allocate data to reducers. If you use the default shuffler, shuffle, this type must have the form Ord a => (a,b).
  • middle2 -- The output of the shuffler. The default shuffler emits a type in the form Ord => (a,[b]). Each middle2 output by shuffler is given to a separate reducer.
  • result -- The output of the reducer, upon being given a bunch of middles.

Useful auxilliaries

chunkify :: Int -> [a] -> [[a]]Source

A convenient way to provide the mtChunkify function as part of mapReduce.

shuffle :: Ord a => [(a, b)] -> [(a, [b])]Source

A convenient way to provide the mtShuffle function as part of mapReduce.

tsay :: String -> TaskM ()Source

A Task-monadic version of say. Puts text messages in the log.

tlogS :: LogSphere -> LogLevel -> String -> TaskM ()Source

Writes various kinds of messages to the Remote.Process log.

data Locality Source

A specification of preference of where a promise should be allocated, among the nodes visible to the master.

Constructors

LcUnrestricted

The promise can be placed anywhere.

LcDefault

The default preference is applied, which is for nodes having a role of NODE of WORKER

LcByRole [String]

Nodes having the given roles will be preferred

LcByNode [NodeId]

The given nodes will be preferred

Instances

Internals, not for general use