I am looking for a library or function call in Python, or an associated library, that would let me feed in a raw stream of text data representing an HTTP request or response and get that information back in some meaningful form, like a dictionary or list. I do not want to use some built-in class or create a bunch of new objects; my program receives raw data, and that is all I've got to work with. Is there already a solution out there for this, or do I have to write an HTTP parser myself?

Edit: Let me clarify what exactly I'm looking to do. I'm looking for something that would take a string like:

GET /index.html HTTP/1.1 \r\n
Host:www.stackoverflow.com \r\n
User-Agent:Firefox \r\n
etc.

And send me back something encapsulating the method, HTTP version, headers and all the rest.

A: 

Something like this? http://docs.python.org/library/htmllib.html

Jordan
Worth noting from beyond the jump: 'Deprecated since version 2.6: The htmllib module has been removed in Python 3.0.'
Wilduck
Valid point. I have not moved to Python 3.0 yet, which is why I thought it was still relevant.
Jordan
+1  A: 

http://docs.python.org/library/httplib.html I believe this is the library you are looking for. It is renamed to http.client in Python 3, but otherwise it is good to go.

Gabriel
I looked at that but could not quite find what I needed. Correct me if I'm wrong, but doesn't that lib revolve around actually making/receiving requests? I don't want to make/receive any requests, I just want to look at raw data. Could you give an example of the method you believe would do this?
themaestro
Well, the HTTP request you receive contains the raw header data, and you use this library to create a header dictionary. This is what your post describes. If you are looking to receive raw text data over a socket, you might try http://docs.python.org/library/socket.html but you will be reinventing a lot of the wheel. Conversely, if you already have the raw text and want a way to parse it into a valid request header, you can try http://deron.meranda.us/python/httpheader/pydoc#-parse_token_or_quoted_string but I have not tried this myself.
Gabriel
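For what it's worth, here is a minimal sketch of the "raw text in, dictionary out" idea from the question, using only the Python 3 standard library. It is not taken from any of the libraries linked above: the helper name parse_http_request and the layout of the returned dictionary are made up for illustration. It assumes the whole request is already in one string, that lines end in \r\n, and that the request line is well formed. It relies on HTTP headers having the same "Name: value" layout as mail headers, so email.parser can read them.

```python
from email.parser import Parser


def parse_http_request(raw):
    """Split a raw HTTP/1.x request string into a plain dict (hypothetical helper)."""
    # The head (request line plus headers) ends at the first blank line.
    head, _, body = raw.partition("\r\n\r\n")
    request_line, _, header_block = head.partition("\r\n")

    # Request line layout: METHOD SP request-target SP HTTP-version
    method, target, version = request_line.split()

    # Reuse the standard mail-header parser for the "Name: value" lines.
    message = Parser().parsestr(header_block, headersonly=True)

    # Assumption: one value per header name is enough; repeated headers
    # would overwrite each other in this dict.
    headers = {name.lower(): value.strip() for name, value in message.items()}

    return {
        "method": method,
        "target": target,
        "version": version,
        "headers": headers,
        "body": body,
    }


raw = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: www.stackoverflow.com\r\n"
    "User-Agent: Firefox\r\n"
    "\r\n"
)
parsed = parse_http_request(raw)
print(parsed["method"], parsed["target"], parsed["version"])
print(parsed["headers"])
```

Header names are lowercased because HTTP header names are case-insensitive. A malformed request line makes the unpacking raise ValueError, so real input would need error handling around it.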
A: 

I'd start by looking at WebOb. I think the cgi module in the standard library also has an HTTP parser.

Marius Gedminas