Angel

angel is a daemon that runs and monitors other processes.  It
is similar to djb's daemontools or the Ruby project god.
Its goals are to keep a set of services running, and to facilitate
the easy configuration and restart of those services.
Motivation
The author is a long-time user of daemontools due to its reliability
and simplicity; however, daemontools is quirky and follows many
unusual conventions.
angel is an attempt to recreate daemontools's capabilities (though
not the various bundled utility programs which are still quite useful)
in a more intuitive and modern unix style.
Functionality
angel is driven by a configuration file that contains a list of
program specifications to run.  angel assumes every program listed in
the specification file should be running at all times.
angel starts each program, and optionally sets the program's stdout
and stderr to some file(s) which have been opened in append mode
(or pipes stdout and stderr to some logger process); at
this point, the program is said to be "supervised".
If the program dies for any reason, angel waits a specified number
of seconds (default, 5), then restarts the program.
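The start/wait/restart cycle described above can be sketched in a few lines of Python. This is only an illustration of the idea, not angel's Haskell implementation: the function name supervise, the max_restarts parameter (added so the sketch can terminate), and the injectable sleep argument are all assumptions made for the example.

```python
import subprocess
import sys
import time


def supervise(argv, delay=5, max_restarts=None, sleep=time.sleep):
    """Run argv; each time it exits, wait `delay` seconds and run it again.

    max_restarts is a hypothetical knob so this sketch can stop; a real
    supervisor keeps going for as long as the program is configured.
    Returns the list of exit codes observed.
    """
    exit_codes = []
    while True:
        child = subprocess.Popen(argv)    # START / RUNNING
        exit_codes.append(child.wait())   # ENDED
        if max_restarts is not None and len(exit_codes) > max_restarts:
            return exit_codes
        sleep(delay)                      # WAITING, then RESTART


if __name__ == "__main__":
    # One start plus two restarts of a child that exits immediately.
    print(supervise([sys.executable, "-c", "pass"], delay=1, max_restarts=2))
```

The comments map each step to the state names that appear in angel's log output in the usage example further down.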
The angel process itself will respond to a HUP signal by
re-processing its configuration file, and synchronizing the run
states with the new configuration.  Specifically:
- If a new program has been added to the file, it is started and
supervised
- If a program's specification has changed (command line path,
stdout/stderr path, delay time, etc.) that supervised child
process will be sent a TERM signal, and as a consequence of
normal supervision, will be restarted with the updated spec
- If a program has been removed from the configuration file,
the corresponding child process will be sent a TERM signal;
when it dies, supervision of the process will end, and
therefore, it will not be restarted
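The three reload rules above amount to a diff between the specs currently supervised and the specs in the freshly parsed file. A minimal Python sketch of that comparison follows; the function name reconcile and the use of plain dicts for specs are assumptions for illustration, not angel's actual data structures.

```python
def reconcile(running, desired):
    """Compare supervised specs with a freshly parsed configuration.

    Both arguments map a program id to its spec (a plain dict here).
    Returns three sorted lists of program ids:
      to_start   - new in the file: start and supervise
      to_restart - spec changed: send TERM, supervision restarts it
      to_stop    - removed from the file: send TERM, end supervision
    """
    to_start = sorted(set(desired) - set(running))
    to_stop = sorted(set(running) - set(desired))
    to_restart = sorted(
        prog for prog in set(running) & set(desired)
        if running[prog] != desired[prog]
    )
    return to_start, to_restart, to_stop


if __name__ == "__main__":
    old = {"watch-date": {"exec": "watch date"}, "ls": {"exec": "ls", "delay": 7}}
    new = {"ls": {"exec": "ls", "delay": 10}, "workers": {"exec": "run_worker"}}
    print(reconcile(old, new))  # (['workers'], ['ls'], ['watch-date'])
```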
Safety and Reliability
Because of angel's role in policing the behavior of other
daemons, it has been written to be very reliable:
- It is written in Haskell, which boasts a combination of
strong, static typing and purity-by-default that lends
itself to very low bug counts
- It uses multiple, simple, independent lightweight threads
with specific roles, ownership, and interfaces
- It uses STM for mutex-free state synchronization between
these threads
- It falls back to polling behavior to ensure eventual
synchronization between configuration state and run
state, just in case odd timing issues should make
event-triggered changes fail
- It simply logs errors and keeps running the last good
configuration if it runs into problems on configuration
reloads
- It has logged hundreds of thousands of uptime-hours
since 2010-07 supervising all the daemons that power
http://bu.mp without a single memory leak or crash
Building
- Install the haskell-platform (or, by some other means, ghc 7.0 +
cabal-install)
- Run cabal install in the project root (this directory)
- Either add the ~/.cabal/bin directory to your $PATH or copy
the angel executable to /usr/local/bin
Notes:
- I have not tried building angel against ghc 6.10 or earlier;
6.12, 7.0, 7.2, 7.4, and 7.6 are known to work
Testing
If you prefer to stick with haskell tools, use cabal to build the package.
If you have Ruby installed, I've set up a Rakefile for assisting in the
build/testing/sandboxing/dependency process. This isn't necessary to build or
test Angel, but it makes it easier. Run:
gem install bundler # if you don't have it already
bundle install
rake --tasks
If you're using cabal 0.17 or later, and I suggest you do, run
rake sandbox
Run the full test suite with
rake test
You can also run guard start, which watches for changes made to any source/test
files and re-runs the tests for a rapid feedback cycle.
Configuration and Usage Example
The angel executable takes exactly one argument: a path to
an angel configuration file.
angel's configuration system is based on Bryan O'Sullivan's configurator
package.  A full description of the format can be found here:
http://hackage.haskell.org/packages/archive/configurator/0.1.0.0/doc/html/Data-Configurator.html
A basic configuration file might look like this:
watch-date {
    exec = "watch date"
}
ls {
    exec = "ls"
    stdout = "/tmp/ls_log"
    stderr = "/tmp/ls_log"
    delay = 7
}
workers {
    directory = "/path/to/worker"
    exec      = "run_worker"
    count     = 30
    pidfile   = "/path/to/pidfile.pid"
    env {
      FOO = "BAR"
      BAR = "BAZ"
    }
}
Each program that should be supervised starts a program-id block:
watch-date {
Then, a series of corresponding configuration commands follow:
- exec is the exact command line to run (required)
- stdout is a path to a file where the program's standard output
should be appended (optional, defaults to /dev/null)
- stderr is a path to a file where the program's standard error
should be appended (optional, defaults to /dev/null)
- delay is the number of seconds (integer) angel should wait
after the program dies before attempting to start it again
(optional, defaults to 5)
- directory is the current working directory of the newly
executed program (optional, defaults to angel's cwd)
- logger is another process that should be launched to handle
logging.  The exec process will then have its stdout and stderr
piped into stdin of this logger.  Recommended log
rotation daemons include clog or multilog.  Note that
if you use a logger process, it is a configuration error
to specify either stdout or stderr as well.
- count is an optional argument to specify the number of processes to spawn.
For instance, if you specify a count of 2, angel will spawn the program
twice, internally as workers-1 and workers-2. Note that count will inject
the environment variable ANGEL_PROCESS_NUMBER into the child process's
environment.
- pidfile is an optional argument to specify where a pidfile should be
created. If you don't specify an absolute path, it will use the running
directory of angel. When combined with the count option, specifying a
pidfile of worker.pid will generate worker-1.pid, worker-2.pid, etc.
- env is a nested config of string key/value pairs. Non-string values are
invalid.
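To make the interaction between count, pidfile, and env concrete, here is a rough Python sketch of how one program block could be expanded into per-process entries. The function name expand and the dict-based spec are hypothetical. The value given to ANGEL_PROCESS_NUMBER (the same number used in the internal id) is an assumption, as is leaving the id unsuffixed when count is absent.

```python
import os


def expand(program_id, spec):
    """Expand one program block into a list of (id, spec) pairs."""
    count = spec.get("count")
    if count is None:
        # Assumption: without count, the block is used as written.
        return [(program_id, dict(spec))]
    expanded = []
    for n in range(1, count + 1):
        child = {k: v for k, v in spec.items() if k != "count"}
        # Assumed value: the process number matches the id suffix.
        child["env"] = {**spec.get("env", {}), "ANGEL_PROCESS_NUMBER": str(n)}
        if "pidfile" in spec:
            # worker.pid becomes worker-1.pid, worker-2.pid, ...
            root, ext = os.path.splitext(spec["pidfile"])
            child["pidfile"] = f"{root}-{n}{ext}"
        expanded.append((f"{program_id}-{n}", child))
    return expanded


if __name__ == "__main__":
    block = {"exec": "run_worker", "count": 2, "pidfile": "worker.pid"}
    for name, child in expand("workers", block):
        print(name, child["pidfile"], child["env"]["ANGEL_PROCESS_NUMBER"])
```

In this sketch a count of 0 yields an empty list, which lines up with the FAQ entry on taking a service down without deleting its block.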
Assuming the above configuration was in a file called "example.conf",
here's what a shell session might look like:
jamie@choo:~/random/angel$ angel example.conf 
[2010/08/24 15:21:22] {main} Angel started
[2010/08/24 15:21:22] {main} Using config file: example.conf
[2010/08/24 15:21:22] {process-monitor} Must kill=0, must start=2
[2010/08/24 15:21:22] {- program: watch-date -} START
[2010/08/24 15:21:22] {- program: watch-date -} RUNNING
[2010/08/24 15:21:22] {- program: ls -} START
[2010/08/24 15:21:22] {- program: ls -} RUNNING
[2010/08/24 15:21:22] {- program: ls -} ENDED
[2010/08/24 15:21:22] {- program: ls -} WAITING
[2010/08/24 15:21:29] {- program: ls -} RESTART
[2010/08/24 15:21:29] {- program: ls -} START
[2010/08/24 15:21:29] {- program: ls -} RUNNING
[2010/08/24 15:21:29] {- program: ls -} ENDED
[2010/08/24 15:21:29] {- program: ls -} WAITING
.. etc
You can see that when the configuration is parsed, the process-monitor
notices that two programs need to be started.  A supervisor is started
in a lightweight thread for each, and starts logging with the context
program: <program-id>.
watch-date starts up and runs.  Since watch is a long-running process
it just keeps running in the background.
ls, meanwhile, runs and immediately ends, of course; then, the WAITING
state is entered until delay seconds pass.  Finally, the RESTART event
is triggered and it is started again, ad nauseam.
Now, let's see what happens if we modify the config file to look like this:
#watch-date {
#    exec = "watch date"
#}
ls {
    exec = "ls"
    stdout = "/tmp/ls_log"
    stderr = "/tmp/ls_log"
    delay = 7
}
.. and then send HUP to angel.
[2010/08/24 15:33:59] {config-monitor} HUP caught, reloading config
[2010/08/24 15:33:59] {process-monitor} Must kill=1, must start=0
[2010/08/24 15:33:59] {- program: watch-date -} ENDED
[2010/08/24 15:33:59] {- program: watch-date -} QUIT
[2010/08/24 15:34:03] {- program: ls -} RESTART
[2010/08/24 15:34:03] {- program: ls -} START
[2010/08/24 15:34:03] {- program: ls -} RUNNING
[2010/08/24 15:34:03] {- program: ls -} ENDED
[2010/08/24 15:34:03] {- program: ls -} WAITING
As you can see, the config monitor reloaded on HUP, and then the
process monitor marked the watch-date process for killing.  TERM
was sent to the child process, and then the supervisor loop QUIT
because the watch-date program no longer had a config entry.
This also works for when you specify count. Incrementing/decrementing the count
will intelligently shut down excess processes and spin new ones up.
Advanced Configuration
The configurator package supports import statements, as
well as environment variable expansion.  Using collections
of configuration files and host-based or service-based
environment variables, efficient, templated angel
configurations can be had.
FAQ
Can I have multiple programs logging to the same file?
Yes, angel dup()s file descriptors and makes an effort to safely
allow concurrent writes by child programs; you should DEFINITELY
make sure your child program is doing stdout/stderr writes in
line-buffered mode so this doesn't result in a complete interleaved
mess in the log file.
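As an illustration of line-at-a-time output in a child program, here is a small Python sketch. The helper name log_line is made up for this example; the point is simply that each message goes out as one complete line and is flushed right away, rather than sitting in a block buffer.

```python
import sys


def log_line(message, stream=sys.stdout):
    """Write one complete line and flush it immediately."""
    stream.write(message.rstrip("\n") + "\n")
    stream.flush()


if __name__ == "__main__":
    log_line("worker started")
    log_line("worker ready")
```

On Python 3.7 or later, calling sys.stdout.reconfigure(line_buffering=True) once at startup is another way to get similar behaviour for ordinary print calls.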
Will angel restart programs for me?
No, there is no restart command.  The design is for you to send your
program TERM yourself; angel will then restart it as part of normal
supervision.  angel tries to work in harmony with traditional
Unix process management conventions.
How can I take a service down without wiping out its configuration?
Specify a count of 0 for the process. That will kill any running processes
but still let you keep it in the config file.
CHANGELOG
0.5.0
- Drop dependency on MissingH
0.4.4
- Add env option to config.
- Inject ANGEL_PROCESS_NUMBER environment variable into processes started
with count.
0.4.3
- Fix install failure from pidfile module not being accounted for.
0.4.2
- Add pidfile option to program spec to specify a pidfile location.
0.4.1
- Add count option to program spec to launch multiple instances of a program.
Author
Original Author: Jamie Turner jamie@jamwt.com
Current Maintainer: Michael Xavier michael@michaelxavier.net
Thanks to Bump Technologies, Inc. (http://bu.mp) for sponsoring some
of the work on angel.
And, of course, thanks to all Angel's contributors:
https://github.com/MichaelXavier/Angel/contributors