Watchdog

Server instance controller process

NaviServer has a built-in "watchdog" feature that restarts the server automatically if it fails. In many situations this frees you from setting up a separate mechanism for handling server failures that a simple restart would solve.

Just start the server with the new "-w" switch.

Technically speaking, the nsd process forks twice, and the first forked instance (the watchdog) controls the second (the worker). The watchdog reacts to the worker's exit codes and to signals caught while watching, and restarts the worker accordingly.
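
The supervising idea can be illustrated with a small shell loop. This is only a sketch of the general pattern, not NaviServer's implementation: the real watchdog lives inside nsd and also reacts to signals. The function name supervise, the rule "non-zero exit status means restart", and the restart cap are assumptions made for this example.

```shell
#!/bin/sh
# supervise MAX_RESTARTS COMMAND [ARGS...]
#
# Hypothetical helper: runs COMMAND in the foreground and starts it
# again whenever it exits with a non-zero status. A clean exit
# (status 0) ends the loop. MAX_RESTARTS is a safety cap invented
# for this sketch so that a broken worker cannot loop forever.
supervise() {
    max_restarts=$1
    shift
    restarts=0
    while :; do
        # Run the worker and remember how it ended.
        "$@"
        status=$?
        if [ "$status" -eq 0 ]; then
            echo "worker exited cleanly; watchdog stops"
            return 0
        fi
        if [ "$restarts" -ge "$max_restarts" ]; then
            echo "giving up after $restarts restarts"
            return 1
        fi
        restarts=$((restarts + 1))
        echo "worker exited with status $status; restart $restarts"
    done
}
```

For example, supervise 5 ./my_worker would run a (hypothetical) ./my_worker program and start it again up to five times after failures.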

Restarting the Server

To restart the server from within your Tcl code:

ns_shutdown -restart

To restart the server from the shell, send it the SIGINT signal.

Pro:

  • Built-in feature. No need to set up anything else.
  • Ability to restart from within your application.

Con:

  • First step of the implementation. You may need more options than the watchdog currently offers. Please tell us here [1]

Other approaches

There are several ways to keep your server running. Depending on your particular requirements the watchdog may not be the right thing for you. Common approaches are listed below.

Using init with /etc/inittab

Very easy to set up, but with its own limitations. init is the parent of all processes. It creates processes from the entries in /etc/inittab. Add a line like

ns1:345:respawn:/usr/local/ns/bin/nsd -i -u nsadmin -g users -t /usr/local/ns/config.tcl

and init does the job of restarting the server if it crashes.

Pro:

  • Very easy to configure.
  • Effectively a built-in feature of your OS.

Con:

  • You need to be root to edit /etc/inittab and make changes.
  • If there's an error during startup, init tries to restart the server, then waits for some time, and repeats this endlessly.
  • There's no simple way to just stop the server when you want to, because init tries to keep it up.

Using cron

If you already have an rc script that starts your server at boot time, you can simply run a cron job that performs a check such as

rcnaviserver status || rcnaviserver start

as part of a script every 5 minutes by adding a crontab line (crontab -e) like

*/5 * * * *                /root/cronjobs/nsd_crontab

Pro:

  • If you wrap the status check in your own script, you have more flexibility in choosing the right action for the job, e.g. e-mail notifications.
  • Works fine if you don't need virtually zero downtime.

Con:

  • Slightly more work if you just want to stop the server for maintenance etc. (e.g. only restart if a certain status file does not exist).
  • You have to touch at least three files: the rc script, the crontab and the restart script.
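
A wrapper script such as /root/cronjobs/nsd_crontab could combine the status check with the status-file idea mentioned above. The following is a minimal sketch under stated assumptions: the function name ensure_running and the maintenance flag path are hypothetical, and the check is inverted relative to the wording above (the server is left alone while the flag file exists). The status and start commands are passed in as arguments so that the rcnaviserver calls from this section can be plugged in.

```shell
#!/bin/sh
# ensure_running STATUS_CMD START_CMD MAINT_FLAG
#
# Hypothetical helper for a cron wrapper script. It prints one word
# describing what happened:
#   maintenance   flag file exists, nothing was touched
#   running       status check succeeded
#   started       status check failed, start command succeeded
#   start-failed  status check failed, start command failed too
ensure_running() {
    status_cmd=$1
    start_cmd=$2
    maint_flag=$3

    # The operator created the flag file: leave the server stopped.
    if [ -e "$maint_flag" ]; then
        echo "maintenance"
        return 0
    fi

    # Commands are expanded unquoted on purpose so that
    # "rcnaviserver status" splits into command and argument.
    if $status_cmd >/dev/null 2>&1; then
        echo "running"
        return 0
    fi

    if $start_cmd >/dev/null 2>&1; then
        echo "started"
        return 0
    fi

    echo "start-failed"
    return 1
}
```

The wrapper would then end with a call like ensure_running "rcnaviserver status" "rcnaviserver start" /var/run/nsd.maintenance, where the flag path is an assumption. To stop the server for maintenance, create the flag file first and remove it again afterwards. The printed word is a convenient place to hook in an e-mail notification.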

Using daemontools

Daemontools is a collection of software to control other processes.

Pro:

  • Allows more users to control the process, e.g. users of a specific group.

Con:

  • Extra package; you need to compile it yourself.
  • You need to be root.