Copyright	(c) Tim Watson 2012 - 2013
License	BSD3 (see the file LICENSE)
Maintainer	Tim Watson <watson.timothy@gmail.com>
Stability	experimental
Portability	non-portable (requires concurrency)
Safe Haskell	None
Language	Haskell98

Control.Distributed.Process.Supervisor

Contents

Defining and Running a Supervisor
Limits and Defaults
Adding and Removing Children
Normative Shutdown
Queries and Statistics
Additional (Misc) Types

Description

This module implements a process which supervises a set of other processes, referred to as its children. These child processes can be either workers (i.e., processes that do something useful in your application) or other supervisors. In this way, supervisors may be used to build a hierarchical process structure called a supervision tree, which provides a convenient structure for building fault tolerant software.

Unless otherwise stated, all functions in this module will cause the calling process to exit unless the specified supervisor process exists.

Supervision Principles

A supervisor is responsible for starting, stopping and monitoring its child processes so as to keep them alive by restarting them when necessary.

The supervisors children are defined as a list of child specifications (see ChildSpec). When a supervisor is started, its children are started in left-to-right (insertion order) according to this list. When a supervisor stops (or exits for any reason), it will terminate its children in reverse (i.e., from right-to-left of insertion) order. Child specs can be added to the supervisor after it has started, either on the left or right of the existing list of children.

When the supervisor spawns its child processes, they are always linked to their parent (i.e., the supervisor), therefore even if the supervisor is terminated abruptly by an asynchronous exception, the children will still be taken down with it, though somewhat less ceremoniously in that case.

Restart Strategies

Supervisors are initialised with a RestartStrategy, which describes how the supervisor should respond to a child that exits and should be restarted (see below for the rules governing child restart eligibility). Each restart strategy comprises a RestartMode and RestartLimit, which govern how the restart should be handled, and the point at which the supervisor should give up and terminate itself respectively.

With the exception of the RestartOne strategy, which indicates that the supervisor will restart only the one individual failing child, each strategy describes a way to select the set of children that should be restarted if any child fails. The RestartAll strategy, as its name suggests, selects all children, whilst the RestartLeft and RestartRight strategies select all children to the left or right of the failed child, in insertion (i.e., startup) order.

Note that a branch restart will only occur if the child that exited is meant to be restarted. Since Temporary children are never restarted and Transient children are not restarted if they exit normally, in both these circumstances we leave the remaining supervised children alone. Otherwise, the failing child is always included in the branch to be restarted.

For a hypothetical set of children a through d, the following pseudocode demonstrates how the restart strategies work.

let children = [a..d]
let failure = c
restartsFor RestartOne   children failure = [c]
restartsFor RestartAll   children failure = [a,b,c,d]
restartsFor RestartLeft  children failure = [a,b,c]
restartsFor RestartRight children failure = [c,d]

Branch Restarts

We refer to a restart (strategy) that involves a set of children as a branch restart from now on. The behaviour of branch restarts can be further refined by the RestartMode with which a RestartStrategy is parameterised. The RestartEach mode treats each child sequentially, first stopping the respective child process and then restarting it. Each child is stopped and started fully before moving on to the next, as the following imaginary example demonstrates for children [a,b,c]:

stop  a
start a
stop  b
start b
stop  c
start c

By contrast, RestartInOrder will first run through the selected list of children, stopping them. Then, once all the children have been stopped, it will make a second pass, to handle (re)starting them. No child is started until all children have been stopped, as the following imaginary example demonstrates:

stop  a
stop  b
stop  c
start a
start b
start c

Both the previous examples have shown children being stopped and started from left to right, but that is up to the user. The RestartMode data type's constructors take a RestartOrder, which determines whether the selected children will be processed from LeftToRight or RightToLeft.

Sometimes it is desireable to stop children in one order and start them in the opposite. This is typically the case when children are in some way dependent on one another, such that restarting them in the wrong order might cause the system to misbehave. For this scenarios, there is another RestartMode that will shut children down in the given order, but then restarts them in the reverse. Using RestartRevOrder mode, if we have children [a,b,c] such that b depends on a and c on b, we can stop them in the reverse of their startup order, but restart them the other way around like so:

RestartRevOrder RightToLeft

The effect will be thus:

stop  c
stop  b
stop  a
start a
start b
start c

Restart Intensity Limits

If a child process repeatedly crashes during (or shortly after) starting, it is possible for the supervisor to get stuck in an endless loop of restarts. In order prevent this, each restart strategy is parameterised with a RestartLimit that caps the number of restarts allowed within a specific time period. If the supervisor exceeds this limit, it will stop, terminating all its children (in left-to-right order) and exit with the reason ExitOther ReachedMaxRestartIntensity.

The MaxRestarts type is a positive integer, and together with a specified TimeInterval forms the RestartLimit to which the supervisor will adhere. Since a great many children can be restarted in close succession when a branch restart occurs (as a result of RestartAll, RestartLeft or RestartRight being triggered), the supervisor will track the operation as a single restart attempt, since otherwise it would likely exceed its maximum restart intensity too quickly.

Child Restart and Termination Policies

When the supervisor detects that a child has died, the RestartPolicy configured in the child specification is used to determin what to do. If the this is set to Permanent, then the child is always restarted. If it is Temporary, then the child is never restarted and the child specification is removed from the supervisor. A Transient child will be restarted only if it terminates abnormally, otherwise it is left inactive (but its specification is left in place). Finally, an Intrinsic child is treated like a Transient one, except that if this kind of child exits normally, then the supervisor will also exit normally.

When the supervisor does terminate a child, the ChildTerminationPolicy provided with the ChildSpec determines how the supervisor should go about doing so. If this is TerminateImmediately, then the child will be killed without further notice, which means the child will not have an opportunity to clean up any internal state and/or release any held resources. If the policy is TerminateTimeout delay however, the child will be sent an exit signal instead, i.e., the supervisor will cause the child to exit via exit childPid ExitShutdown, and then will wait until the given delay for the child to exit normally. If this does not happen within the given delay, the supervisor will revert to the more aggressive TerminateImmediately policy and try again. Any errors that occur during a timed-out shutdown will be logged, however exit reasons resulting from TerminateImmediately are ignored.

Creating Child Specs

The ToChildStart typeclass simplifies the process of defining a ChildStart providing three default instances from which a ChildStart datum can be generated. The first, takes a Closure (Process ()), where the enclosed action (in the Process monad) is the actual (long running) code that we wish to supervise. In the case of a managed process, this is usually the server loop, constructed by evaluating some variant of ManagedProcess.serve.

The other two instances provide a means for starting children without having to provide a Closure. Both instances wrap the supplied Process action in some necessary boilerplate code, which handles spawning a new process and communicating its ProcessId to the supervisor. The instance for Addressable a => SupervisorPid -> Process a is special however, since this API is intended for uses where the typical interactions with a process take place via an opaque handle, for which an instance of the Addressable typeclass is provided. This latter approach requires the expression which is responsible for yielding the Addressable handle to handling linking the target process with the supervisor, since we have delegated responsibility for spawning the new process and cannot perform the link oepration ourselves.

Supervision Trees & Supervisor Termination

To create a supervision tree, one simply adds supervisors below one another as children, setting the childType field of their ChildSpec to Supervisor instead of Worker. Supervision tree can be arbitrarilly deep, and it is for this reason that we recommend giving a Supervisor child an arbitrary length of time to stop, by setting the delay to Infinity or a very large TimeInterval.

Synopsis

Defining and Running a Supervisor

data ChildSpec Source

Specification for a child process. The child must be uniquely identified by it's childKey within the supervisor. The supervisor will start the child itself, therefore childRun should contain the child process' implementation e.g., if the child is a long running server, this would be the server loop, as with e.g., ManagedProces.start.

Constructors

ChildSpec
Fields childKey :: !ChildKey childType :: !ChildType childRestart :: !RestartPolicy childStop :: !ChildTerminationPolicy childStart :: !ChildStart childRegName :: !(Maybe RegisteredName)

Instances

Show ChildSpec Source
Generic ChildSpec Source
Binary ChildSpec Source
NFData ChildSpec Source
type Rep ChildSpec Source

type ChildKey = String Source

Identifies a child process by name.

data ChildType Source

Specifies whether the child is another supervisor, or a worker.

Constructors

Worker
Supervisor

Instances

Eq ChildType Source
Show ChildType Source
Generic ChildType Source
Binary ChildType Source
NFData ChildType Source
type Rep ChildType Source

data ChildStart Source

Constructors

RunClosure !(Closure (Process ()))
CreateHandle !(Closure (SupervisorPid -> Process (ChildPid, Message)))
StarterProcess !StarterPid

Instances

Show ChildStart Source
Generic ChildStart Source
Binary ChildStart Source
NFData ChildStart Source
type Rep ChildStart Source

data RegisteredName Source

Constructors

LocalName !String
CustomRegister !(Closure (ChildPid -> Process ()))

Instances

Show RegisteredName Source
Generic RegisteredName Source
Binary RegisteredName Source
NFData RegisteredName Source
type Rep RegisteredName Source

data RestartPolicy Source

Describes when a terminated child process should be restarted.

Constructors

Permanent	a permanent child will always be restarted
Temporary	a temporary child will never be restarted
Transient	A transient child will be restarted only if it terminates abnormally
Intrinsic	as `Transient`, but if the child exits normally, the supervisor also exits normally

Instances

Eq RestartPolicy Source
Show RestartPolicy Source
Generic RestartPolicy Source
Binary RestartPolicy Source
NFData RestartPolicy Source
type Rep RestartPolicy Source

data ChildRef Source

A reference to a (possibly running) child.

Constructors

ChildRunning !ChildPid	a reference to the (currently running) child
ChildRunningExtra !ChildPid !Message	also a currently running child, with extra child info
ChildRestarting !ChildPid	a reference to the old (previous) child (now restarting)
ChildStopped	indicates the child is not currently running
ChildStartIgnored	a non-temporary child exited with `ChildInitIgnore`

isRunning :: ChildRef -> Bool Source

isRestarting :: ChildRef -> Bool Source

type Child = (ChildRef, ChildSpec) Source

type StaticLabel = String Source

Static labels (in the remote table) are strings.

type SupervisorPid = ProcessId Source

type ChildPid = ProcessId Source

type StarterPid = ProcessId Source

class ToChildStart a where Source

A type that can be converted to a ChildStart.

Methods

toChildStart :: a -> Process ChildStart Source

Instances

ToChildStart (Process ()) Source
ToChildStart (Closure (SupervisorPid -> Process (ChildPid, Message))) Source
ToChildStart (Closure (Process ())) Source
Resolvable a => ToChildStart (SupervisorPid -> Process a) Source

start :: RestartStrategy -> ShutdownMode -> [ChildSpec] -> Process SupervisorPid Source

Start a supervisor (process), running the supplied children and restart strategy.

start = spawnLocal . run

run :: RestartStrategy -> ShutdownMode -> [ChildSpec] -> Process () Source

Run the supplied children using the provided restart strategy.

Limits and Defaults

data MaxRestarts Source

Instances

Show MaxRestarts Source
Generic MaxRestarts Source
Binary MaxRestarts Source
NFData MaxRestarts Source
type Rep MaxRestarts Source

maxRestarts :: Int -> MaxRestarts Source

Smart constructor for MaxRestarts. The maximum restart count must be a positive integer.

data RestartLimit Source

A compulsary limit on the number of restarts that a supervisor will tolerate before it terminates all child processes and then itself. If > MaxRestarts occur within the specified TimeInterval, termination will occur. This prevents the supervisor from entering an infinite loop of child process terminations and restarts.

Constructors

RestartLimit
Fields maxR :: !MaxRestarts maxT :: !TimeInterval

Instances

Show RestartLimit Source
Generic RestartLimit Source
Binary RestartLimit Source
NFData RestartLimit Source
type Rep RestartLimit Source

limit :: MaxRestarts -> TimeInterval -> RestartLimit Source

defaultLimits :: RestartLimit Source

data RestartMode Source

Constructors

RestartEach	stop then start each child sequentially, i.e., `foldlM stopThenStart children`
Fields order :: !RestartOrder
RestartInOrder	stop all children first, then restart them sequentially
Fields order :: !RestartOrder
RestartRevOrder	stop all children in the given order, but start them in reverse
Fields order :: !RestartOrder

Instances

Eq RestartMode Source
Show RestartMode Source
Generic RestartMode Source
Binary RestartMode Source
NFData RestartMode Source
type Rep RestartMode Source

data RestartOrder Source

Constructors

LeftToRight
RightToLeft

Instances

Eq RestartOrder Source
Show RestartOrder Source
Generic RestartOrder Source
Binary RestartOrder Source
NFData RestartOrder Source
type Rep RestartOrder Source

data RestartStrategy Source

Strategy used by a supervisor to handle child restarts, whether due to unexpected child failure or explicit restart requests from a client.

Some terminology: We refer to child processes managed by the same supervisor as siblings. When restarting a child process, the RestartNone policy indicates that sibling processes should be left alone, whilst the RestartAll policy will cause all children to be restarted (in the same order they were started).

The other two restart strategies refer to prior and subsequent siblings, which describe's those children's configured position (i.e., insertion order). These latter modes allow one to control the order in which siblings are restarted, and to exclude some siblings from the restart without having to resort to grouping them using a child supervisor.

Constructors

RestartOne	restart only the failed child process
Fields intensity :: !RestartLimit
RestartAll	also restart all siblings
Fields intensity :: !RestartLimit mode :: !RestartMode
RestartLeft	restart prior siblings (i.e., prior start order)
Fields intensity :: !RestartLimit mode :: !RestartMode
RestartRight	restart subsequent siblings (i.e., subsequent start order)
Fields intensity :: !RestartLimit mode :: !RestartMode

Instances

Show RestartStrategy Source
Generic RestartStrategy Source
Binary RestartStrategy Source
NFData RestartStrategy Source
type Rep RestartStrategy Source

data ShutdownMode Source

Constructors

SequentialShutdown !RestartOrder
ParallelShutdown

Instances

Eq ShutdownMode Source
Show ShutdownMode Source
Generic ShutdownMode Source
Binary ShutdownMode Source
NFData ShutdownMode Source
type Rep ShutdownMode Source

restartOne :: RestartStrategy Source

Provides a default RestartStrategy for RestartOne. > restartOne = RestartOne defaultLimits

restartAll :: RestartStrategy Source

Provides a default RestartStrategy for RestartAll. > restartOne = RestartAll defaultLimits (RestartEach LeftToRight)

restartLeft :: RestartStrategy Source

Provides a default RestartStrategy for RestartLeft. > restartOne = RestartLeft defaultLimits (RestartEach LeftToRight)

restartRight :: RestartStrategy Source

Provides a default RestartStrategy for RestartRight. > restartOne = RestartRight defaultLimits (RestartEach LeftToRight)

Adding and Removing Children

addChild :: Addressable a => a -> ChildSpec -> Process AddChildResult Source

Add a new child.

data AddChildResult Source

Constructors

ChildAdded !ChildRef
ChildFailedToStart !StartFailure

Instances

Eq AddChildResult Source
Show AddChildResult Source
Generic AddChildResult Source
Binary AddChildResult Source
NFData AddChildResult Source
type Rep AddChildResult Source

startChild :: Addressable a => a -> ChildKey -> Process StartChildResult Source

Start an existing (configured) child. The ChildSpec must already be present (see addChild), otherwise the operation will fail.

startNewChild :: Addressable a => a -> ChildSpec -> Process AddChildResult Source

Atomically add and start a new child spec. Will fail if a child with the given key is already present.

terminateChild :: Addressable a => a -> ChildKey -> Process TerminateChildResult Source

Terminate a running child.

deleteChild :: Addressable a => a -> ChildKey -> Process DeleteChildResult Source

Delete a supervised child. The child must already be stopped (see terminateChild).

data DeleteChildResult Source

The result of a call to removeChild.

Constructors

ChildDeleted	the child specification was successfully removed
ChildNotFound	the child specification was not found
ChildNotStopped !ChildRef	the child was not removed, as it was not stopped.

Instances

Eq DeleteChildResult Source
Show DeleteChildResult Source
Generic DeleteChildResult Source
Binary DeleteChildResult Source
NFData DeleteChildResult Source
type Rep DeleteChildResult Source

restartChild :: Addressable a => a -> ChildKey -> Process RestartChildResult Source

Forcibly restart a running child.

Normative Shutdown

shutdown :: Resolvable a => a -> Process () Source

Gracefully terminate a running supervisor. Returns immediately if the address cannot be resolved.

shutdownAndWait :: Resolvable a => a -> Process () Source

As shutdown, but waits until the supervisor process has exited, at which point the caller can be sure that all children have also stopped. Returns immediately if the address cannot be resolved.

Queries and Statistics

lookupChild :: Addressable a => a -> ChildKey -> Process (Maybe (ChildRef, ChildSpec)) Source

Lookup a possibly supervised child, given its ChildKey.

listChildren :: Addressable a => a -> Process [Child] Source

List all know (i.e., configured) children.

data SupervisorStats Source

Constructors

SupervisorStats
Fields _children :: Int _supervisors :: Int _workers :: Int _running :: Int _activeSupervisors :: Int _activeWorkers :: Int totalRestarts :: Int

Instances

Show SupervisorStats Source
Generic SupervisorStats Source
Binary SupervisorStats Source
NFData SupervisorStats Source
type Rep SupervisorStats Source

statistics :: Addressable a => a -> Process SupervisorStats Source

Obtain statistics about a running supervisor.

Additional (Misc) Types

data StartFailure Source

Provides failure information when (re-)start failure is indicated.

Constructors

StartFailureDuplicateChild !ChildRef	a child with this `ChildKey` already exists
StartFailureAlreadyRunning !ChildRef	the child is already up and running
StartFailureBadClosure !StaticLabel	a closure cannot be resolved
StartFailureDied !DiedReason	a child died (almost) immediately on starting

Instances

Eq StartFailure Source
Show StartFailure Source
Generic StartFailure Source
Binary StartFailure Source
NFData StartFailure Source
type Rep StartFailure Source

data ChildInitFailure Source

Constructors

ChildInitFailure !String
ChildInitIgnore

Instances

Show ChildInitFailure Source
Generic ChildInitFailure Source
Exception ChildInitFailure Source
type Rep ChildInitFailure Source

Eq ChildTerminationPolicy Source
Show ChildTerminationPolicy Source
Generic ChildTerminationPolicy Source
Binary ChildTerminationPolicy Source
NFData ChildTerminationPolicy Source
type Rep ChildTerminationPolicy Source

Eq ChildRef Source
Show ChildRef Source
Generic ChildRef Source
Binary ChildRef Source
NFData ChildRef Source
Resolvable ChildRef Source
Routable ChildRef Source
type Rep ChildRef Source

ChildStartOk !ChildRef
ChildStartFailed !StartFailure
ChildStartUnknownId
ChildStartInitIgnored

Eq StartChildResult Source
Show StartChildResult Source
Generic StartChildResult Source
Binary StartChildResult Source
NFData StartChildResult Source
type Rep StartChildResult Source

Eq TerminateChildResult Source
Show TerminateChildResult Source
Generic TerminateChildResult Source
Binary TerminateChildResult Source
NFData TerminateChildResult Source
type Rep TerminateChildResult Source

ChildRestartOk !ChildRef
ChildRestartFailed !StartFailure
ChildRestartUnknownId
ChildRestartIgnored

Eq RestartChildResult Source
Show RestartChildResult Source
Generic RestartChildResult Source
Binary RestartChildResult Source
NFData RestartChildResult Source
type Rep RestartChildResult Source