Is there a standard function that will convert HTTP headers into a Python dictionary, and one to convert back?

They would need to support header folding, of course.

+1  A: 

I'm not entirely sure, but this seems to be along the lines of what you are looking for

Hope this helps

inspectorG4dget
+1  A: 

In case you don't find any library solving the problem, here's a naive, untested solution:

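MAX_LINE = 998  # longest line allowed before folding

def fold(header):
  line = "%s: %s" % (header[0], header[1])
  lines = [line]
  while len(lines[-1]) > MAX_LINE:
    split_this = lines.pop()
    # find the last space in the longest admissible chunk; start at index 1
    # so a continuation line is never split at its own leading space
    split_here = split_this[:MAX_LINE].rfind(" ", 1)
    if split_here == -1:  # no space to fold at, leave the line as it is
      lines.append(split_this)
      break
    # the second part keeps its leading space, which marks a continuation;
    # it may still be too long, hence the while on lines[-1]
    lines += [split_this[:split_here], split_this[split_here:]]
  return "\n".join(lines)

def dict2header(data):
  return "\n".join(fold(header) for header in data.items())

def header2dict(data):
  # unfold: a newline followed by a space continues the previous line
  data = data.replace("\n ", " ").splitlines()
  headers = {}
  for line in data:
    # split at the first colon only, and drop the colon and padding
    name, _, value = line.partition(":")
    headers[name] = value.strip()
  return headers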
badp
I would be surprised if this actually worked in all cases. :)
badp
Thanks, I'll use this, and fix what errors I find if I can't get anything else.
Jeffrey Aylesworth
I've turned this answer into community wiki, so you can merge the fixes as/if required.
badp
+3  A: 

Rather than building your own parser on top of sockets, I would use httplib. It fetches the data from the HTTP server and parses the headers for you; getheaders() returns them as (name, value) pairs, which dict() turns into a dictionary, e.g.

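import httplib  # named http.client in Python 3
conn = httplib.HTTPConnection("www.python.org")
conn.request("GET", "/index.html")
r1 = conn.getresponse()

# getheaders() returns a list of (name, value) tuples;
# pass it to dict() if you need a dictionary
headers = r1.getheaders()
print(headers)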

gives

[('content-length', '16788'), ('accept-ranges', 'bytes'), ('server', 'Apache/2.2.9 (Debian) DAV/2 SVN/1.5.1 mod_ssl/2.2.9 OpenSSL/0.9.8g mod_wsgi/2.5 Python/2.5.2'), ('last-modified', 'Mon, 15 Feb 2010 07:30:46 GMT'), ('etag', '"105800d-4194-47f9e9871d580"'), ('date', 'Mon, 15 Feb 2010 21:34:18 GMT'), ('content-type', 'text/html')]

httplib also covers the other direction: request() accepts a headers argument, so you can send a dictionary of headers as part of a request.
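When the header text is already in hand and no connection is involved, the standard email package can do the parsing, since HTTP headers share the same "Name: value" syntax (in Python 3, http.client's own header class is built on it). The following is only a minimal Python 3 sketch: the function names headers_to_dict and dict_to_headers are made up for illustration, and the input is assumed to be a bare header block without the request or status line.

```python
import email
import re

def headers_to_dict(raw):
    """Parse a raw header block (hypothetical helper) into a plain dict."""
    message = email.message_from_string(raw)
    # Under the default policy a folded value keeps its line breaks,
    # so each break plus its leading whitespace is joined into one space.
    return {name: re.sub(r"\r?\n[ \t]+", " ", value)
            for name, value in message.items()}

def dict_to_headers(headers):
    """Serialize a dict back to header text, one unfolded field per line."""
    lines = ["%s: %s" % (name, value) for name, value in headers.items()]
    # A blank line terminates the header block.
    return "\r\n".join(lines) + "\r\n\r\n"

raw = ("Host: www.python.org\r\n"
       "X-Long: first part\r\n"
       " second part\r\n"
       "\r\n")
headers = headers_to_dict(raw)
print(headers)
# Round trip: serializing and parsing again yields the same dict.
print(headers_to_dict(dict_to_headers(headers)) == headers)
```

Two caveats with this sketch: a plain dict keeps only the last value of a repeated header such as Set-Cookie, and the serializer never folds long lines, which should be acceptable for HTTP because RFC 7230 deprecated header folding.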

Mark
A: 

And this is my version, which avoids an explicit for loop:

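import re

# request line, e.g. "POST /path HTTP/1.1" (dots escaped so they match literally)
req_line = re.compile(r'(?P<method>GET|POST)\s+(?P<resource>.+?)\s+(?P<version>HTTP/1\.1)')
# one "key: value" field; the key stops at the first colon and the value
# may be a single character (e.g. "Content-Length: 0")
field_line = re.compile(r'(?P<key>[^:\s]+)[ \t]*:[ \t]*(?P<value>\S(?:.*\S)?)')

def parse(http_post):
    # normalize CRLF line endings so the '\n' searches below work
    # (note that this also rewrites line endings inside the body)
    http_post = http_post.replace('\r\n', '\n')
    first_line_end = http_post.find('\n')
    # a blank line separates the headers from the body
    headers_end = http_post.find('\n\n')
    request = req_line.match(
        http_post[:first_line_end]
    ).groupdict()
    # findall returns (key, value) tuples, which dict() accepts directly
    headers = dict(
        field_line.findall(
            http_post[first_line_end:headers_end]
        )
    )
    body = http_post[headers_end + 2:]
    return request, headers, body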
Altraqua