I ran into the same issue with EC2 instances, but addressed it in a different way: instead of monitoring the instances, I had them shut themselves down automatically after a set amount of time. From your description, it sounds like this may not be practical in your environment, but I thought I would share it in case it helps. My AMI was Fedora-based, so I created the following bash init script, registered it as a service, and had it run at startup. It uses at to schedule the commands in /root/atshutdown to run 50 minutes after boot:
#!/bin/bash
# chkconfig: 2345 68 20
# description: 50 Minute Kill

# Source Functions (provides success and failure)
. /etc/rc.d/init.d/functions

# Exit status reported to the caller; set to 1 on any failure
RETVAL=0

start()
{
    # Shut down 50 minutes after starting up
    at now + 50 minutes < /root/atshutdown
}

stop()
{
    # Remove all jobs from the at queue because I'm not using at for anything else
    for job in $(atq | awk '{print $1}')
    do
        atrm "$job"
    done
}

case "$1" in
    start)
        start && success || { failure; RETVAL=1; }
        echo
        ;;
    stop)
        stop && success || { failure; RETVAL=1; }
        echo
        ;;
    restart)
        stop && start && success || { failure; RETVAL=1; }
        echo
        ;;
    status)
        # List the pending at jobs (empty output means no shutdown is scheduled)
        atq
        ;;
    *)
        echo $"Usage: $0 {start|stop|restart|status}"
        RETVAL=1
        ;;
esac
exit $RETVAL
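The script above reads its commands from /root/atshutdown, which is not shown. A minimal sketch of how that file might be created and the service registered is below. The file contents, the `write_atshutdown` helper, and the service name `autokill` are all assumptions made for illustration, not what I necessarily used:

```shell
#!/bin/bash
# Hypothetical helper: write the command file that `at` will execute.
# The single shutdown command is an assumption; the original file is not shown.
write_atshutdown() {
    local target="${1:-/root/atshutdown}"
    printf '%s\n' '/sbin/shutdown -h now' > "$target"
}

# One-time setup, run as root on the instance before bundling the image.
# Left commented out so this snippet can be sourced without side effects.
# "autokill" is a made-up name for the init script saved in /etc/init.d.
#
#   write_atshutdown
#   chmod 755 /etc/init.d/autokill
#   chkconfig --add autokill    # reads the "# chkconfig:" header in the script
#   chkconfig autokill on
```

After registering it, `service autokill status` should list the pending at job once the instance has booted.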
You might consider doing something similar to suit your needs. If you do, be especially careful to stop the service before modifying your image, so that the instance does not shut down before you get a chance to re-bundle.
If you wanted, you could have the instances shut down at a fixed time (after everyone leaves work?), or you could pass in a keep-alive length or shutdown time via the -d or -f parameters to ec2-run-instances and parse it out in the script.
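As a rough sketch of the user-data idea, the helpers below pull a keep-alive length out of the instance's user data and fall back to 50 minutes when none is given. The `keepalive=<minutes>` format and the function names `parse_keepalive` and `fetch_user_data` are assumptions I made up for this example; the metadata URL is the standard EC2 user-data endpoint, and an instance that requires session tokens for metadata requests would need that request adjusted:

```shell
#!/bin/bash
# Hypothetical helper: extract the keep-alive length (in minutes) from user data
# containing a line of the assumed form "keepalive=<minutes>".
# Prints the default (second argument, or 50) when no such line is found.
parse_keepalive() {
    local user_data="$1"
    local default_minutes="${2:-50}"
    local minutes
    minutes=$(printf '%s\n' "$user_data" \
        | sed -n 's/^keepalive=\([0-9][0-9]*\)$/\1/p' \
        | head -n 1)
    if [ -n "$minutes" ]; then
        echo "$minutes"
    else
        echo "$default_minutes"
    fi
}

# Hypothetical helper: fetch the user data passed with -d or -f.
# Only works when run on the instance itself.
fetch_user_data() {
    curl -s --max-time 5 http://169.254.169.254/latest/user-data
}

# Possible use inside start() of the init script:
#   at now + "$(parse_keepalive "$(fetch_user_data)")" minutes < /root/atshutdown
```

With this in place, launching with something like `ec2-run-instances ... -d "keepalive=120"` would give that instance two hours instead of 50 minutes. For the fixed-time variant, the at line could instead name a clock time, for example `at 19:00 < /root/atshutdown`.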