We currently run a small shared hosting service for a couple of hundred small PHP sites on our servers. We'd like to offer Python support too, but from our initial research at least, a server restart seems to be required after each source code change.

Is this really the case? If so, we're just not going to be able to offer Python hosting support. Giving our clients the ability to upload files is easy, but we can't have them restart the (shared) server process!

PHP is easy -- you upload a new version of a file, the new version is run.

I've a lot of respect for the Python language and community, so find it hard to believe that it really requires such a crazy process to update a site's code. Please tell me I'm wrong! :-)

+4  A: 

It depends on how you deploy the Python application. If it runs as a pure Python CGI script, no restarts are necessary (not advised at all though, because it will be super slow). If you are using mod_wsgi in Apache, there are valid ways of reloading the source. mod_python apparently has some support for module reloading, with accompanying issues.

There are ways other than Apache to host Python applications, including the CherryPy server, Paste Server, Zope, Twisted, and Tornado.

However, unless you have a specific reason not to use it (and since you are presumably coming from an Apache/PHP shop), I would highly recommend mod_wsgi on Apache. I know that Django recommends mod_wsgi on Apache, and most of the other major Python frameworks will work on mod_wsgi.

John Paulett
Thanks for the answer John. You've given some nice pointers to follow up on, but haven't really answered the real question: Why are server restarts required at all? They are not in PHP, but are in Python (CGI aside). I find it confusing, as they share the same basic execution model otherwise...
Leon
They don't share the same basic execution model. PHP is file based. Apache/mod_php reads a file and interprets the PHP code on every single request. Contrast that with Python apps, which (depending on deployment) are typically long-running processes that exist even when there are no requests. You don't need to restart the server under Apache mod_wsgi. You just touch the .wsgi file and the Python process(es) will get reloaded.
Brian Neal
It's worth noting that `mod_wsgi`'s "embedded" mode, in addition to having performance and reliability issues, prevents the proper operation of code reloading. I *always* configure my servers in daemon mode, and from what I've heard from others, changing to daemon mode fixes many problems with `mod_wsgi`.
John Millikin
Good tip, thanks John.
Leon
John, embedded mode of mod_wsgi does not have performance and reliability issues. The mod_wsgi module itself works perfectly fine as will any Python web application run on top of it, provided that you configure Apache correctly. So, nothing to do with mod_wsgi but due to people not configuring Apache correctly when embedded mode is used. Read 'http://blog.dscpl.com.au/2009/03/load-spikes-and-excessive-memory-usage.html'. It is a manageable issue if you do your homework properly.
Graham Dumpleton
Brian, touching the .wsgi file only works for daemon mode of mod_wsgi. See 'http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode'.
Graham Dumpleton
John, the ability of mod_python to reload code files only applies to mod_python specific handlers and not all Python modules. Thus in mod_python, if running Django or other non mod_python specific application or framework, you still need to restart whole Apache server for code in standard Python modules/packages to be reloaded.
Graham Dumpleton
@Graham: I'm talking about mod_wsgi, not mod_python. mod_wsgi's embedded mode is slow, sometimes crashes, and doesn't properly support reloading. Though, mod_python is even worse -- in my experience, changing from mod_python to mod_wsgi can almost double supported connections per second.
John Millikin
@Graham, yes, I failed to mention that works only in daemon mode. Thanks.
Brian Neal
+5  A: 

Python is a compiled language; the compiled byte code is cached by the Python process for later use, to improve performance. PHP, by default, is interpreted. It's a tradeoff between usability and speed.

If you're using a standard WSGI module, such as Apache's mod_wsgi, then you don't have to restart the server -- just touch the .wsgi file and the code will be reloaded. If you're using some weird server which doesn't support WSGI, you're sort of on your own usability-wise.
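To make the "touch the .wsgi file" point concrete, here is a minimal sketch of what such a file might contain. The file name `site.wsgi` and the request counter are assumptions for illustration only; mod_wsgi looks for a callable named `application`.

```python
# site.wsgi (hypothetical file name)
import itertools

# Module-level state is created once per process, not once per request.
# This is the long-running behaviour that makes a reload step necessary.
_request_counter = itertools.count(1)


def application(environ, start_response):
    """Minimal WSGI callable returning a plain-text response."""
    n = next(_request_counter)
    body = f"Hello from a long-running process, request #{n}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]
```

The counter keeps climbing across requests because the module is loaded once and kept in memory, so an edited copy on disk is not picked up until the process is reloaded.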

John Millikin
Actually both languages are compiled to byte code, then the byte code is interpreted in both cases, although it is true that the CPython implementation caches the bytecode for faster startup. PHP has Zend Accelerator and APC which do something very similar. My reading seems to suggest that it's not as simple as just touching the .wsgi file, but I'll investigate further, thanks!
Leon
I use mod_wsgi and Apache, and it really is just that simple. `touch /path/to/site.wsgi`, and the new code goes live. Regarding PHP, it's my understanding that the "main" implementation is an interpreter, with commercial compiler-based implementations available -- is this correct? outdated?
John Millikin
Good to know, thanks John. Does that reloading behaviour also apply to the modules imported into the main .wsgi script? (See bobince's answer)
Leon
Yes; when the WSGI child process is reloaded, any modules which have been modified will be reloaded.
John Millikin
Repeating comment from elsewhere, touching the .wsgi file only works for daemon mode of mod_wsgi. See 'http://code.google.com/p/modwsgi/wiki/ReloadingSourceCode'.
Graham Dumpleton
Leon, in PHP, the whole world is built on every request, then torn down. In other words, the interpreter returns to a fresh state after every request. That isn't typically how you run a Python web app. Python apps are long running processes that persist between requests.
Brian Neal
+1  A: 

Is this really the case?

It depends. Code reloading is highly specific to the hosting solution. Most servers provide some way to automatically reload the WSGI script itself, but there's no standardisation; indeed, the question of how a WSGI Application object is connected to a web server at all differs widely across hosting environments. (You can just about make a single script file that works as deployment glue for CGI, mod_wsgi, passenger and ISAPI_WSGI, but it's not wholly trivial.)

What Python really struggles with, though, is module reloading, which is problematic for WSGI applications because any non-trivial webapp will encapsulate its functionality in modules and packages rather than simple standalone scripts. It turns out reloading modules is quite tricky, because if you reload() them one by one they can easily end up with bad references to old versions. Ideally the way forward would be to reload the whole Python interpreter when any file is updated, but in practice some C extensions seem not to like this, so it isn't generally done.
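A small sketch of the "bad references to old versions" problem, assuming a throwaway module written to a temporary directory. The module name `demo_handlers` and the `Handler` class are made up for the demonstration, and `importlib.reload` stands in for the older builtin `reload()`.

```python
import importlib
import os
import sys
import tempfile

# Keep the demo independent of .pyc caching.
sys.dont_write_bytecode = True


def demo_stale_reference():
    """Reload a module and show that earlier references keep the old class."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo_handlers.py")
        with open(path, "w") as f:
            f.write("class Handler:\n    version = 1\n")

        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            import demo_handlers

            # Another module might hold on to these objects.
            old_class = demo_handlers.Handler
            instance = old_class()

            # Simulate uploading a new version of the file.
            with open(path, "w") as f:
                f.write("class Handler:\n    version = 2  # edited\n")
            importlib.reload(demo_handlers)
            new_class = demo_handlers.Handler

            return (
                old_class.version,                # still the old code
                new_class.version,                # the reloaded code
                isinstance(instance, new_class),  # old object, new class
            )
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_handlers", None)
```

Calling `demo_stale_reference()` returns `(1, 2, False)`: anything that captured `Handler` before the reload keeps running the old code, and existing instances no longer match the new class.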

There are workarounds to reload a group of modules at once which can reliably update an application when one of its modules is touched. I use a deployment module that does this (which I haven't got around to publishing, but can chuck you a copy if you're interested) and it works great for my own webapps. But you do need a little discipline to make sure you don't accidentally start leaving references to your old modules' objects in other modules you aren't reloading; if you're talking loads of sites written by third parties whose code may be leaky, this might not be ideal.

In that case you might want to look at something like running mod_wsgi in daemon mode with an application group for each party and process-level reloading, and touch the WSGI script file when you've updated any of the modules.
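One way to automate the "touch the WSGI script" step is a small helper run after each upload. This is only a sketch under assumptions: the function name `touch_if_stale` and both paths are hypothetical, and it relies on mod_wsgi running in daemon mode so that a changed modification time on the script triggers a process reload.

```python
import os
from pathlib import Path


def touch_if_stale(wsgi_script, source_dir):
    """Touch wsgi_script if any .py file under source_dir is newer than it.

    Returns True when the script was touched, False otherwise.
    """
    script = Path(wsgi_script)
    script_mtime = script.stat().st_mtime

    # Newest modification time among the site's Python modules.
    newest = max(
        (p.stat().st_mtime for p in Path(source_dir).rglob("*.py")),
        default=0.0,
    )

    if newest > script_mtime:
        os.utime(script, None)  # set atime/mtime to now, like `touch`
        return True
    return False
```

A hosting service could call something like `touch_if_stale("/path/to/site.wsgi", "/path/to/site")` from its upload hook, so clients never need access to the server process itself.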

You're right to complain; this (and many other WSGI deployment issues) could do with some standardisation help.

bobince
Thanks Bobince for a very thoughtful answer! Please don't think I'm complaining, however; I'd really like to know *why*, and how we can improve things. Finding shared hosting is hard for Python developers, and will remain that way unless we can simplify the shared execution model...
Leon
Just re-reading your answer. You draw a useful distinction between code (like in 'main.wsgi'), and modules imported into that code (and modules included into those modules, and so on). I'm off to run some experiments with mod_wsgi in daemon mode, as that seems to be the most promising avenue of investigation, thanks.
Leon