watchdog-0.3: Simple control structure to re-try an action with exponential backoff

Safe Haskell: None
Language: Haskell98

Control.Watchdog

Description

How to use:

import Control.Watchdog
import Data.Time

errorProneTask :: IO (Either String ())
errorProneTask = do
    getCurrentTime >>= print
    return $ Left "some error"

main = watchdog $ watch errorProneTask

Result:

2012-07-09 21:48:19.592252 UTC
Watchdog: Error executing task (some error) - waiting 1s before trying again.
2012-07-09 21:48:20.594381 UTC
Watchdog: Error executing task (some error) - waiting 2s before trying again.
2012-07-09 21:48:22.597069 UTC
Watchdog: Error executing task (some error) - waiting 4s before trying again.
...

Alternatively, the watchdog can give up after a fixed number of retries:

import Control.Watchdog
import Data.Time

errorProneTask :: IO (Either String ())
errorProneTask = do
    getCurrentTime >>= print
    return $ Left "some error"

main = do
    result <- watchdog $ do
        setMaximumRetries 2
        watchImpatiently errorProneTask
    print result

Result:

2012-07-09 21:55:41.046432 UTC
Watchdog: Error executing task (some error) - waiting 1s before trying again.
2012-07-09 21:55:42.047246 UTC
Watchdog: Error executing task (some error) - waiting 2s before trying again.
2012-07-09 21:55:44.049993 UTC
Left "some error"

The watchdog will execute the task and check the return value, which should be an Either value where Left signals an error and Right signals success.
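Many IO actions report failure by throwing an exception rather than returning an Either. The following is a minimal sketch of one way to adapt such an action to the Left/Right convention described above. The helper name toWatchable is hypothetical and not part of Control.Watchdog; it only uses try from base.

```haskell
import Control.Exception (SomeException, try)

-- | Hypothetical helper (not part of Control.Watchdog): run an action and
-- turn any exception it throws into a 'Left' carrying the exception's
-- 'show' output. A normal result is passed through as 'Right'.
toWatchable :: IO a -> IO (Either String a)
toWatchable action = do
    outcome <- try action
    return $ case outcome of
        Left e  -> Left (show (e :: SomeException))
        Right x -> Right x
```

A task wrapped this way could then be passed to the watchdog, for example as watchdog $ watch (toWatchable someAction), where someAction stands for your own IO action. Catching SomeException is a deliberately broad choice for the sketch; a real program may prefer to catch a narrower exception type.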

The watchdog will back off exponentially (up to a maximum delay) if errors persist. Once the task has been running for a while without problems (see setResetDuration), the delay is reset, and any new errors start a fresh cycle of exponential backoff.
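To make the backoff schedule concrete, here is a small pure sketch of the sequence of pauses for a task that keeps failing. It is an illustration of the doubling-with-a-cap rule described above, not the library's own code; the function name backoffDelays is made up for this example.

```haskell
-- | Illustrative only: the pauses (in microseconds) for a task that fails
-- every time, assuming the delay starts at the initial delay and doubles
-- after each failure until it reaches the maximum delay.
backoffDelays :: Int -> Int -> [Int]
backoffDelays initial maxDelay =
    iterate (\d -> min maxDelay (d * 2)) (min maxDelay initial)

-- | The same schedule in whole seconds, for the first n pauses.
backoffSeconds :: Int -> Int -> Int -> [Int]
backoffSeconds n initial maxDelay =
    map (`div` 1000000) (take n (backoffDelays initial maxDelay))
```

Under these assumptions, backoffSeconds 10 (1 * 10^6) (300 * 10^6) evaluates to [1,2,4,8,16,32,64,128,256,300]: the tenth pause would be 512 seconds but is capped at the 300 second maximum.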

The module is intended to be used in different watchdog settings. For example, it can keep an eye on a server process (use watch and only return a successful result when the server is doing a clean shutdown), or it can retry an action several times before giving up (use watchImpatiently). A monadic approach is used to modify the various settings. Below is a code sample with all possible configuration options and their default values:

import Control.Watchdog
import Data.Time

errorProneTask :: IO (Either String ())
errorProneTask = do
    getCurrentTime >>= print
    return $ Left "some error"

main = watchdog $ do
        setInitialDelay $ 1 * 10^6      -- 1 second
        setMaximumDelay $ 300 * 10^6    -- 300 seconds
        setMaximumRetries 10            -- has no effect when using 'watch'
        setResetDuration $ 30 * 10^6    -- 30 seconds
        setLoggingAction defaultLogger
        watch errorProneTask

Documentation

watchdog :: WatchdogAction String a -> IO a

The Watchdog monad. Used to configure and eventually run a watchdog.

watchdogBlank :: WatchdogAction e a -> IO a

As with watchdog, but don't specify a default logging action to allow it to be polymorphic.
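As a sketch of why the polymorphic variant is useful, the example below runs a task whose error type is not String. The FetchError type, the fetch task and the name runFetch are all hypothetical; only the Control.Watchdog functions documented on this page are assumed. The logging action is set explicitly so that the example does not rely on whatever watchdogBlank uses by default.

```haskell
import Control.Watchdog

-- Hypothetical error type for illustration; 'showLogger' only needs a
-- 'Show' instance.
data FetchError = Timeout | BadStatus Int
    deriving Show

-- Stand-in task that always fails with a structured error.
fetch :: IO (Either FetchError String)
fetch = return (Left (BadStatus 503))

-- Retry the task a limited number of times, logging errors via 'show'.
runFetch :: IO (Either FetchError String)
runFetch = watchdogBlank $ do
    setLoggingAction showLogger
    setInitialDelay 1000        -- 1 millisecond, to keep the demo quick
    setMaximumRetries 2
    watchImpatiently fetch
```

Because the error stays a FetchError all the way through, the caller can pattern match on the final Left instead of parsing a string.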

watch :: IO (Either e a) -> WatchdogAction e a

Watch a task, restarting it potentially forever or until it returns with a result. The task should return an Either, where Left in combination with an error message signals an error and Right with an arbitrary result signals success.

watchImpatiently :: IO (Either e b) -> WatchdogAction e (Either e b)

Watch a task, but only restart it a limited number of times (see setMaximumRetries). If the failure persists, it will be returned as a Left, otherwise it will be Right with the result of the task.

setInitialDelay :: Int -> WatchdogAction e ()

Set the initial delay in microseconds. The watchdog's first pause after a failure lasts this long. The default is 1 second.

setMaximumDelay :: Int -> WatchdogAction e ()

Set the maximum delay in microseconds. When a task fails to execute properly multiple times in quick succession, the delay is doubled each time until it stays constant at the maximum delay. The default is 300 seconds.

setMaximumRetries :: Integer -> WatchdogAction e ()

Set the number of retries after which the watchdog will give up and return with a permanent error. This setting is only used in combination with watchImpatiently. The default is 10.

setResetDuration :: Int -> WatchdogAction e ()

If a task has been running for some time, the watchdog will consider the next failure to be something unrelated and reset the waiting time back to the initial delay. This function sets the amount of time in microseconds that needs to pass before the watchdog will consider a task to be successfully running. The default is 30 seconds.
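The reset rule can be sketched as a pure function that picks the next pause from the previous one and the time the task ran before failing again. This is only a model of the behaviour described above, not the library's implementation; the Settings record and nextDelay are hypothetical names, and whether the real comparison is strict or non-strict is an assumption.

```haskell
-- | Hypothetical record holding the three relevant settings, all in
-- microseconds.
data Settings = Settings
    { initialDelay  :: Int
    , maximumDelay  :: Int
    , resetDuration :: Int
    }

-- | Sketch of the reset rule: if the task ran for at least the reset
-- duration before failing, start over at the initial delay; otherwise
-- double the previous delay, capped at the maximum delay.
nextDelay :: Settings -> Int -> Int -> Int
nextDelay s previous elapsed
    | elapsed >= resetDuration s = initialDelay s
    | otherwise                  = min (maximumDelay s) (previous * 2)

-- | The default values listed in this document.
defaults :: Settings
defaults = Settings (1 * 10^6) (300 * 10^6) (30 * 10^6)
```

In this model, a task that fails again 5 seconds after a 4 second pause is given an 8 second pause next, while a task that ran for 45 seconds before failing goes back to the 1 second initial delay.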

setLoggingAction :: WatchdogLogger e -> WatchdogAction e ()

Set the logging action that will be called by the watchdog. The supplied function of type WatchdogLogger will be provided with the error message of the task and either Nothing if the watchdog will retry immediately or 'Just delay' if the watchdog will now pause for the specified amount of time before trying again. The default is defaultLogger.
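A custom logger is just a function of the WatchdogLogger shape. The sketch below writes the standard message to stderr instead of stdout by reusing formatWatchdogError. The names stderrLogger, flakyTask and runWithStderrLog are hypothetical and exist only for this example.

```haskell
import Control.Watchdog
import System.IO (hPutStrLn, stderr)

-- | Hypothetical logger: formats the report with 'formatWatchdogError'
-- and prints it on stderr.
stderrLogger :: WatchdogLogger String
stderrLogger err delay = hPutStrLn stderr (formatWatchdogError err delay)

-- Stand-in task that always fails.
flakyTask :: IO (Either String ())
flakyTask = return (Left "some error")

-- Run the task with the custom logger and a single retry.
runWithStderrLog :: IO (Either String ())
runWithStderrLog = watchdog $ do
    setLoggingAction stderrLogger
    setMaximumRetries 1
    watchImpatiently flakyTask
```

The same pattern should work for other sinks, such as a log file handle or a logging library, as long as the function takes the error value and the optional delay and returns IO ().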

defaultLogger :: WatchdogLogger String

The default logging action. It will call formatWatchdogError and display the result on STDOUT.

showLogger :: Show e => WatchdogLogger e

As with defaultLogger, but calls show on the value provided.

silentLogger :: WatchdogLogger e

Disable logging by passing this function to setLoggingAction.

formatWatchdogError

Arguments

:: (IsString str, Monoid str)
=> str          Error message returned by the task.
-> Maybe Int    Waiting time - if any - before trying again.
-> str

Format the watchdog status report. Will produce output like this:

Watchdog: Error executing task (some error) - trying again immediately.
Watchdog: Error executing task (some error) - waiting 1s before trying again.

type WatchdogLogger e

Arguments

 = e            Error value returned by the task.
-> Maybe Int    Waiting time - if any - before trying again.
-> IO ()

Type synonym for a watchdog logger.