I'm running a django instance behind nginx connected using fcgi (by using the manage.py runfcgi command). Since the code is loaded into memory I can't reload new code without killing and restarting the django fcgi processes, thus interrupting the live website. The restarting itself is very fast. But by killing the fcgi processes first some users' actions will get interrupted which is not good. I'm wondering how can I reload new code without ever causing any interruption. Advices will be highly appreciated!
I would start a new fcgi process on a new port, change the nginx configuration to use the new port, have nginx reload configuration (which in itself is graceful), then eventually stop the old process (you can use netstat to find out when the last connection to the old port is closed).
Alternatively, you can change the fcgi implementation to fork a new process, close all sockets in the child except for the fcgi server socket, close the fcgi server socket in parent, exec a new django process in the child (making it use the fcgi server socket), and terminate the parent process once all fcgi connections are closed. IOW, implement graceful restart for runfcgi.
You can use spawning instead of FastCGI
So I went ahead and implemented Martin's suggestion. Here is the bash script I came up with.
pid_file=/path/to/pidfile
port_file=/path/to/port_file
old_pid=`cat $pid_file`
if [[ -f $port_file ]]; then
last_port=`cat $port_file`
port_to_use=$(($last_port + 1))
else
port_to_use=8000
fi
# Reset so me don't go up forever
if [[ $port_to_use -gt 8999 ]]; then
port_to_use=8000
fi
sed -i "s/$old_port/$port_to_use/g" /path/to/nginx.conf
python manage.py runfcgi host=127.0.0.1 port=$port_to_use maxchildren=5 maxspare=5 minspare=2 method=prefork pidfile=$pid_file
echo $port_to_use > $port_file
kill -HUP `cat /var/run/nginx.pid`
echo "Sleeping for 5 seconds"
sleep 5s
echo "Killing old processes on $last_port, pid $old_pid"
kill $old_pid
I came across this page while looking for a solution for this problem. Everything else failed, so I looked in to the source code :)
The solution seems to be much simpler. Django fcgi server uses flup, which handles the HUP signal the proper way: it shuts down, gracefully. So all you have to do is to:
send the HUP signal to the fcgi server (the pidfile= argument of runserver will come in handy)
wait a bit (flup allows children processes 10 seconds, so wait a couple more; 15 looks like a good number)
sent the KILL signal to the fcgi server, just in case something blocked it
start the server again
That's it.