A couple of weeks ago I was proper ill with flu. The problem with looking after your own server is that only you can fix it: it's all well and good having a monitoring system (Nagios) telling you about faults, but if you can't read the alerts, the fault won't get resolved.
While I was ill, the MySQL process on my server died for an unknown reason, and as a result my website (and others I look after) was down for 8 hours. The fix was simple: one command to restart the service, and normal service was resumed (excuse the pun).
This led me to the conclusion that there must be a way to get the server to fix itself. After all, why do a job when you can get a computer to do it for you! Fortunately I had a light bulb moment and realised that I could use the init scripts provided by Red Hat. The code below will restart Apache (httpd) and MySQL on a Red Hat based system in the event that the service was not stopped cleanly. (In fact this config has only been tested on CentOS; your mileage may vary on anything else.)
#!/bin/bash
# Taken from the Red Hat default scripts - /etc/rc.d/init.d/functions
# Set up a default search path.
PATH="/sbin:/usr/sbin:/bin:/usr/bin:/usr/X11R6/bin"
export PATH
status() {
    local base=${1##*/}
    local pid
    # Test syntax.
    if [ "$#" = 0 ] ; then
        echo $"Usage: status {program}"
        return 1
    fi
    # First try "pidof"
    pid=`pidof -o $$ -o $PPID -o %PPID -x $1 ||
         pidof -o $$ -o $PPID -o %PPID -x ${base}`
    if [ -n "$pid" ]; then
        # Uncomment this if you want OK messages
        # echo $"${base} (pid $pid) is running..."
        return 0
    fi
    # Next try "/var/run/*.pid" files
    if [ -f /var/run/${base}.pid ] ; then
        read pid < /var/run/${base}.pid
        if [ -n "$pid" ]; then
            # Process is gone but its pid file was left behind: restart it
            echo $"${base} dead but pid file exists"
            /etc/init.d/${base} restart
            return 1
        fi
    fi
    # See if /var/lock/subsys/${base} exists
    if [ -f /var/lock/subsys/${base} ]; then
        # Process is gone but the subsys lock was left behind: restart it
        echo $"${base} dead but subsys locked"
        /etc/init.d/${base} restart
        return 2
    fi
    # No pid file and no lock: the service was stopped cleanly, leave it alone
    echo $"${base} is stopped"
    return 3
}
# found in /etc/init.d/httpd
httpd=${HTTPD-/usr/sbin/httpd}
status mysqld
status $httpd
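To make the decision logic in the script easier to see (and to try out without touching real services), here is a minimal sketch of the same three-way check. The function name classify_service and its run_dir/lock_dir parameters are made up for this example; the real script hard-codes /var/run and /var/lock/subsys and uses pidof rather than a yes/no argument.

```shell
#!/bin/bash
# Sketch only: classify_service, run_dir and lock_dir are hypothetical names
# introduced for illustration; they are not part of the Red Hat functions file.
classify_service() {
    local base=$1      # service name, e.g. mysqld
    local running=$2   # "yes" if a process was found (stand-in for pidof)
    local run_dir=$3   # stand-in for /var/run
    local lock_dir=$4  # stand-in for /var/lock/subsys
    local pid=""

    if [ "$running" = yes ]; then
        echo "running"      # process found, nothing to do
        return 0
    fi
    if [ -f "$run_dir/$base.pid" ]; then
        read -r pid < "$run_dir/$base.pid" || true
    fi
    if [ -n "$pid" ]; then
        echo "restart"      # stale pid file: the service died uncleanly
        return 1
    fi
    if [ -f "$lock_dir/$base" ]; then
        echo "restart"      # stale subsys lock: the service died uncleanly
        return 2
    fi
    echo "stopped"          # no traces left: treated as a clean stop
    return 3
}
```

Note the last case: a service with no pid file and no lock is left alone, which is why a deliberate "service mysqld stop" will not be undone by the hourly check.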
If you save this as /etc/cron.hourly/auto_recovery.sh and then run chmod +x /etc/cron.hourly/auto_recovery.sh, then (assuming you've not changed the default cron setup) MySQL and httpd will be checked every hour. If either has died uncleanly it'll be restarted, and root will get an e-mail about what happened.
Cool, eh!
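The install step can be sketched as a small helper. The function name install_check and its optional second argument are hypothetical, added here only so the step can be tried against a scratch directory; on a real server you would run it as root with the default /etc/cron.hourly target.

```shell
#!/bin/bash
# Sketch only: install_check is a made-up helper name for this example.
install_check() {
    local src=$1                          # path to the saved script
    local cron_dir=${2:-/etc/cron.hourly} # default matches the setup above
    # install(1) copies the file and sets its mode in one step,
    # equivalent to cp followed by chmod 755.
    install -m 0755 "$src" "$cron_dir/$(basename "$src")"
}

# Usage (as root): install_check ./auto_recovery.sh
```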
A final finishing touch: I wanted to change the default "Database Down" error messages on my two most popular applications.
- Melvin Rivera has written a tutorial on how to customize the WordPress error page. Note that it involves editing a file outside of wp-content, which means you'll have to redo this "hack" every time you upgrade WordPress.
- phpBB: Setting a custom error page is really easy. First create a PHP page displaying your message. Then at the bottom of /path/to/phpbb-install/includes/db.php you'll see:
// Make the database connection.
$db = new sql_db($dbhost, $dbuser, $dbpasswd, $dbname, false);
if(!$db->db_connect_id)
{
    message_die(CRITICAL_ERROR, "Could not connect to the database");
}
Change it to:
// Make the database connection.
$db = new sql_db($dbhost, $dbuser, $dbpasswd, $dbname, false);
if(!$db->db_connect_id)
{
    include("/path/to/my-custom-error-page.php");
    die();
}
Now if your database dies, then for the time it's down (before cron fixes it) your WordPress and phpBB sites will show a much prettier error message. Obviously there's no equivalent for Apache, as there's nothing left to serve the pages, but hopefully this kind of thing doesn't happen too often :D