watchdog-0.2.2.1: Simple control structure to re-try an action with exponential backoff

Safe Haskell: None

Control.Watchdog

Description

How to use:

 import Control.Watchdog
 import Data.Time

 errorProneTask :: IO (Either String ())
 errorProneTask = do
     getCurrentTime >>= print
     return $ Left "some error"

 main = watchdog $ watch errorProneTask

Result:

 2012-07-09 21:48:19.592252 UTC
 Watchdog: Error executing task (some error) - waiting 1s before trying again.
 2012-07-09 21:48:20.594381 UTC
 Watchdog: Error executing task (some error) - waiting 2s before trying again.
 2012-07-09 21:48:22.597069 UTC
 Watchdog: Error executing task (some error) - waiting 4s before trying again.
 ...

Alternatively the watchdog can stop after a certain number of attempts:

 import Control.Watchdog
 import Data.Time

 errorProneTask :: IO (Either String ())
 errorProneTask = do
     getCurrentTime >>= print
     return $ Left "some error"

 main = do
     result <- watchdog $ do
         setMaximumRetries 2
         watchImpatiently errorProneTask
     print result

Result:

 2012-07-09 21:55:41.046432 UTC
 Watchdog: Error executing task (some error) - waiting 1s before trying again.
 2012-07-09 21:55:42.047246 UTC
 Watchdog: Error executing task (some error) - waiting 2s before trying again.
 2012-07-09 21:55:44.049993 UTC
 Left "some error"

The watchdog will execute the task and check the return value, which should be an Either value where Left signals an error and Right signals success.

The watchdog backs off exponentially (up to a maximum delay) while errors persist. Once the task has been running for a while without problems (see setResetDuration), the delay is reset, and a new cycle of exponential backoff starts should new errors arise.
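To make the backoff schedule concrete, here is a minimal sketch that computes the sequence of delays described above. The function name backoffDelays is hypothetical and not part of Control.Watchdog, and the library's internal implementation may differ; the sketch only assumes the documented behaviour (doubling, capped at a maximum).

```haskell
-- Hypothetical helper for illustration only; not part of Control.Watchdog.
-- Produces the infinite list of delays (in microseconds): each delay is
-- double the previous one, capped at the maximum delay.
backoffDelays :: Int -> Int -> [Int]
backoffDelays initial maxDelay =
    iterate (\d -> min maxDelay (d * 2)) (min maxDelay initial)

main :: IO ()
main =
    -- Documented defaults: 1 second initial delay, 300 seconds maximum delay.
    print (take 10 (backoffDelays 1000000 300000000))
```

With the default settings, the tenth delay in this sketch is the first one to hit the 300 second cap.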

The module is intended to be used in different watchdog settings: for example, to keep an eye on a server process (use watch and only return a successful result when the server is doing a clean shutdown), or to retry an action multiple times, if necessary, before giving up (use watchImpatiently). A monadic approach is used to modify the various settings. Below is a code sample with all possible configuration options and their default values:

 import Control.Watchdog
 import Data.Time

 errorProneTask :: IO (Either String ())
 errorProneTask = do
     getCurrentTime >>= print
     return $ Left "some error"

 main = watchdog $ do
         setInitialDelay $ 1 * 10^6      -- 1 second
         setMaximumDelay $ 300 * 10^6    -- 300 seconds
         setMaximumRetries 10            -- has no effect when using 'watch'
         setResetDuration $ 30 * 10^6    -- 30 seconds
         setLoggingAction defaultLogger
         watch errorProneTask

Documentation

watchdog :: WatchdogAction a -> IO a

Runs an action in the Watchdog monad. The monad is used to configure the watchdog; this function eventually runs it.

watch :: IO (Either String a) -> WatchdogAction a

Watch a task, restarting it potentially forever or until it returns with a result. The task should return an Either, where Left in combination with an error message signals an error and Right with an arbitrary result signals success.
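Many IO actions signal failure by throwing an exception rather than returning an Either. The following sketch shows one way such an action could be adapted to the Either String shape that watch expects. The helper toEither is hypothetical and not part of Control.Watchdog; it uses only Control.Exception from base.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Hypothetical adapter for illustration; not part of Control.Watchdog.
-- Turns a thrown exception into a Left carrying the exception's 'show' text.
-- Note: catching SomeException also catches asynchronous exceptions, so a
-- real program may want to catch a narrower exception type.
toEither :: IO a -> IO (Either String a)
toEither action = do
    r <- try action
    return $ case r of
        Left e  -> Left (show (e :: SomeException))
        Right x -> Right x

main :: IO ()
main = do
    ok  <- toEither (return (42 :: Int))
    bad <- toEither (evaluate (1 `div` (0 :: Int)))
    print ok   -- a Right with the task's result
    print bad  -- a Left with the exception's message
```

An action wrapped this way has the type IO (Either String a) and could then be passed to watch or watchImpatiently.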

watchImpatiently :: IO (Either String b) -> WatchdogAction (Either String b)

Watch a task, but only restart it a limited number of times (see setMaximumRetries). If the failure persists, it will be returned as a Left, otherwise it will be Right with the result of the task.

setInitialDelay :: Int -> WatchdogAction ()

Set the initial delay in microseconds. The first pause of the watchdog lasts for this amount of time. The default is 1 second.

setMaximumDelay :: Int -> WatchdogAction ()

Set the maximum delay in microseconds. When a task fails to execute properly multiple times in quick succession, the delay is doubled each time until it stays constant at the maximum delay. The default is 300 seconds.

setMaximumRetries :: Integer -> WatchdogAction ()

Set the number of retries after which the watchdog will give up and return with a permanent error. This setting is only used in combination with watchImpatiently. The default is 10.

setResetDuration :: Int -> WatchdogAction ()

If a task has been running for some time, the watchdog will consider the next failure to be something unrelated and reset the waiting time back to the initial delay. This function sets the amount of time in microseconds that needs to pass before the watchdog will consider a task to be successfully running. The default is 30 seconds.
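The reset rule can be sketched as a pure decision function. Everything here (the Settings record, nextDelay, and the choice of >= for the comparison) is an assumption made for illustration and is not taken from the library's source; only the default values come from the documentation.

```haskell
-- Hypothetical settings record for illustration; all times in microseconds.
data Settings = Settings
    { initialDelay  :: Int
    , maximumDelay  :: Int
    , resetDuration :: Int
    }

-- Documented defaults: 1 s initial delay, 300 s maximum delay, 30 s reset.
defaults :: Settings
defaults = Settings 1000000 300000000 30000000

-- Given the previous delay and how long the task ran before it failed,
-- choose the next delay: reset if the task ran long enough, otherwise
-- double the previous delay, capped at the maximum.
nextDelay :: Settings -> Int -> Int -> Int
nextDelay s previousDelay ranFor
    | ranFor >= resetDuration s = initialDelay s
    | otherwise                 = min (maximumDelay s) (previousDelay * 2)

main :: IO ()
main = do
    print (nextDelay defaults 4000000 5000000)    -- failed after 5 s: double
    print (nextDelay defaults 4000000 45000000)   -- ran 45 s: reset
    print (nextDelay defaults 256000000 1000000)  -- doubling hits the cap
```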

setLoggingAction :: WatchdogLogger -> WatchdogAction ()

Set the logging action that will be called by the watchdog. The supplied function of type WatchdogLogger will be provided with the error message of the task and either Nothing if the watchdog will retry immediately or 'Just delay' if the watchdog will now pause for the specified amount of time before trying again. The default is defaultLogger.
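As an illustration, a custom logger might send the status report to stderr instead of stdout. The names stderrLogger and flakyTask are hypothetical; the sketch relies only on the functions and types documented on this page and requires the watchdog package to be installed.

```haskell
import Control.Watchdog
import System.IO (hPutStrLn, stderr)

-- Hypothetical logger for illustration; not provided by the library.
-- Reuses formatWatchdogError but writes the report to stderr.
stderrLogger :: WatchdogLogger
stderrLogger err delay = hPutStrLn stderr (formatWatchdogError err delay)

-- Hypothetical task that always fails.
flakyTask :: IO (Either String ())
flakyTask = return (Left "some error")

main :: IO ()
main = do
    result <- watchdog $ do
        setInitialDelay 100000        -- 0.1 seconds, to keep the demo short
        setMaximumRetries 2
        setLoggingAction stderrLogger
        watchImpatiently flakyTask
    print result
```

Because the task never succeeds, watchImpatiently should eventually give up and return the failure as a Left, as in the second example in the description.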

defaultLogger :: WatchdogLogger

The default logging action. It calls formatWatchdogError and prints the result to stdout.

silentLogger :: WatchdogLogger

Disable logging by passing this function to setLoggingAction.

formatWatchdogError
    :: String       -- Error message returned by the task.
    -> Maybe Int    -- Waiting time, if any, before trying again.
    -> String

Format the watchdog status report. Will produce output like this:

 Watchdog: Error executing task (some error) - trying again immediately.
 Watchdog: Error executing task (some error) - waiting 1s before trying again.

type WatchdogLogger
    = String        -- Error message returned by the task.
    -> Maybe Int    -- Waiting time, if any, before trying again.
    -> IO ()

Type synonym for a watchdog logger.