CGI doesn't scale because each request forks a brand new process, which has to start the interpreter and load the script from scratch. That's a lot of overhead. mod_wsgi avoids the overhead by starting the application once and handing each request to that already-running process.
Let's assume the application is the worst kind of CGI: a collection of files like this.
my_cgi.py
import cgi
print "status: 200 OK"
print "content-type: text/html"
print
print "<!doctype...>"
print "<html>"
etc.
You can try to "wrap" the original CGI files to turn them into WSGI applications.
wsgi.py
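import os
import sys
import cStringIO

def my_cgi( environ, start_response ):
    page = cStringIO.StringIO()
    # CGI scripts read the request from os.environ; copy only the string values
    os.environ.update( dict( (k, v) for k, v in environ.items() if isinstance(v, str) ) )
    saved_stdout = sys.stdout
    sys.stdout = page # capture everything the script prints
    try:
        # you may have to pass a globals dict, like execfile( "my_cgi.py", {"__name__": "__main__"} )
        execfile( "my_cgi.py" )
    finally:
        sys.stdout = saved_stdout
    # the captured output starts with the CGI header lines; keep only the body
    body = page.getvalue().partition( "\n\n" )[2]
    status = '200 OK' # HTTP Status
    headers = [('Content-type', 'text/html')] # HTTP Headers
    start_response(status, headers)
    return [body]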
This is a first step toward rewriting your CGI application into a proper framework. It requires very little work, and will make your CGI scripts much more scalable, since you won't be starting a fresh CGI process for each request.
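If you are on Python 3, where execfile and cStringIO no longer exist, the same wrapping idea might look roughly like the sketch below. The names run_cgi_script and make_cgi_app are made up for illustration, and the header parsing assumes the script prints simple "Name: value" lines followed by a blank line. Note that redirecting stdout is process-wide, so this is only safe when requests are not handled by several threads at once.

```python
import contextlib
import io
import os


def run_cgi_script(path, environ):
    """Execute a CGI script in-process and return (status, headers, body)."""
    with open(path) as f:
        code = compile(f.read(), path, "exec")
    # CGI scripts read the request from os.environ; copy only string values.
    os.environ.update({k: v for k, v in environ.items() if isinstance(v, str)})
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(code, {"__name__": "__main__"})
    # Normalise line endings, then split the CGI header block from the body.
    output = captured.getvalue().replace("\r\n", "\n")
    head, _, body = output.partition("\n\n")
    status, headers = "200 OK", []
    for line in head.splitlines():
        name, _, value = line.partition(":")
        if name.strip().lower() == "status":
            status = value.strip()
        else:
            headers.append((name.strip(), value.strip()))
    return status, headers, body


def make_cgi_app(path):
    """Build a WSGI callable that wraps one legacy CGI script."""
    def app(environ, start_response):
        status, headers, body = run_cgi_script(path, environ)
        start_response(status, headers)
        return [body.encode("utf-8")]
    return app


# The script is only opened when a request arrives.
application = make_cgi_app("my_cgi.py")
```

Unlike the hard-coded '200 OK' above, this version passes along whatever status and headers the script printed.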
The second step is to create a mod_wsgi server that Apache uses instead of all the CGI scripts. This server must (1) parse the URLs, and (2) call various functions like the my_cgi example function. Each function will execfile the old CGI script without forking a new process.
Look at Werkzeug for helpful libraries, such as its URL routing.
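Without any library, the URL-parsing step could be as small as a dictionary lookup on PATH_INFO. The sketch below is only an illustration: make_dispatcher, index and report are hypothetical names, and the two handlers stand in for functions that wrap your real CGI scripts. The one fixed name is application, which is what mod_wsgi looks for by default.

```python
def make_dispatcher(routes, not_found=None):
    """Return a WSGI app that picks a handler by exact match on PATH_INFO."""
    def default_not_found(environ, start_response):
        start_response("404 Not Found", [("Content-Type", "text/plain")])
        return [b"Not Found"]

    fallback = not_found or default_not_found

    def application(environ, start_response):
        handler = routes.get(environ.get("PATH_INFO", ""), fallback)
        return handler(environ, start_response)
    return application


# Hypothetical handlers standing in for wrapped CGI scripts.
def index(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html>index</html>"]


def report(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/html")])
    return [b"<html>report</html>"]


# mod_wsgi looks for a callable named "application" by default.
application = make_dispatcher({"/": index, "/report": report})
```

Exact-match lookup is enough when each old script had its own URL. Once you need patterns or path parameters, a real routing library is the better choice.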
If your application's CGI scripts have some structure (functions, classes, etc.), you can probably import those and do something much, much smarter than the above. A better way is this.
wsgi.py
from my_cgi import this_func, that_func

def my_cgi( environ, start_response ):
    # some_args and some_other_args are placeholders for values taken from environ
    result = this_func( some_args )
    page_text = that_func( result, some_other_args )
    status = '200 OK' # HTTP Status
    headers = [('Content-type', 'text/html')] # HTTP Headers
    start_response(status, headers)
    return [page_text]
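To make the placeholders concrete, here is a self-contained Python 3 sketch of the same shape. The bodies of this_func and that_func are invented for illustration (in practice they would be imported from the legacy module), and reading a name parameter from the query string is just an assumed example of where the arguments might come from.

```python
from html import escape
from urllib.parse import parse_qs


# Hypothetical stand-ins for functions imported from the legacy module.
def this_func(name):
    """Business logic: build the data for the page."""
    return {"greeting": "Hello, %s" % name}


def that_func(result, title):
    """Presentation: render the data as HTML."""
    return "<html><head><title>%s</title></head><body>%s</body></html>" % (
        escape(title), escape(result["greeting"]))


def my_cgi(environ, start_response):
    # Take the arguments from the WSGI environ rather than from os.environ.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["world"])[0]
    page_text = that_func(this_func(name), "Greeting")
    body = page_text.encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8"),
                              ("Content-Length", str(len(body)))])
    return [body]
```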
This requires more work because you have to understand the legacy application. However, this has two advantages.
It makes your CGI scripts more scalable because you're not starting a fresh process for each request.
It allows you to rethink your application, possibly changing it to a proper framework. Once you've done this, it's not very hard to take the next step and move to TurboGears or Pylons, or to web.py for a very simple framework.