Let's say we make a request to a URL and get back the raw response, like this:

HTTP/1.1 200 OK
Date: Wed, 28 Apr 2010 14:39:13 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Set-Cookie: PREF=ID=e2bca72563dfffcc:TM=1272465553:LM=1272465553:S=ZN2zv8oxlFPT1BJG; expires=Fri, 27-Apr-2012 14:39:13 GMT; path=/; domain=.google.co.uk
Server: gws
X-XSS-Protection: 1; mode=block
Connection: close

<!doctype html><html><head>...</head><body>...</body></html>

What would be the best way to remove the HTTP headers from the response in C#? With regexes? Parsing it into some kind of HTTPResponse object and using only the body?

EDIT:

I'm using SOCKS to make the request; that's why I get the raw response.

+3  A: 

Headers and body are separated by an empty line. It is really easier to do this without a regex: just search for the first empty line.

Andrey
Mind adding an example? :)
Ed
What example? int i = 0; foreach (string s in response) if (string.IsNullOrEmpty(s)) break; else i++;
Andrey
I came up with this: if (responseString.IndexOf("HTTP/1.1 200 OK") > -1) responseString = responseString.Substring(responseString.IndexOf("\r\n\r\n"));
Ed
+1  A: 

If you use the HttpWebRequest class, you get back an HttpWebResponse object, which in turn exposes the headers as a collection (its Headers property). You can then remove them, parse them, or do whatever you wish with them.

Dan Diplo
I wish .NET had something like that for SOCKS. :(
Ed
+1  A: 

Note that the substring approach will leave the "\r\n\r\n" delimiter at the start of the result, because the substring begins at the delimiter rather than after it. I used this:

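    const string HTTPHeaderDelimiter = "\r\n\r\n";
    // Find the blank line that ends the headers (ordinal search, needs "using System;").
    int delimiterIndex = RawHTTPResponse.IndexOf(HTTPHeaderDelimiter, StringComparison.Ordinal);
    if (RawHTTPResponse.IndexOf("HTTP/1.1 200 OK") > -1 && delimiterIndex > -1)
    {
        // Skip past the delimiter so the payload starts at the first body character.
        HTTPPayload = RawHTTPResponse.Substring(delimiterIndex + HTTPHeaderDelimiter.Length);
    }
    else
    {
        // Not a 200 response, or no blank line found: nothing to extract.
        return;
    }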
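
For anyone who also wants the headers rather than just the body, here is a minimal sketch of the same split wrapped in a helper. It is only an illustration: the class name RawHttpResponse and the methods TrySplit and ParseHeaders are made up for this example and are not part of .NET. It assumes the whole response has already been decoded into a string and that lines end in "\r\n".

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; none of these names come from the framework.
public static class RawHttpResponse
{
    // Splits a raw response at the first blank line ("\r\n\r\n").
    // Returns false when no blank line is present (e.g. a truncated response).
    public static bool TrySplit(string raw, out string head, out string body)
    {
        const string delimiter = "\r\n\r\n";
        int index = raw.IndexOf(delimiter, StringComparison.Ordinal);
        if (index < 0)
        {
            head = raw;
            body = string.Empty;
            return false;
        }
        head = raw.Substring(0, index);
        // Start after the delimiter so the body has no leading line breaks.
        body = raw.Substring(index + delimiter.Length);
        return true;
    }

    // Parses "Name: value" lines, skipping the status line at index 0.
    // Lookups are case-insensitive; a repeated header keeps only its last value.
    public static Dictionary<string, string> ParseHeaders(string head)
    {
        var headers = new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
        string[] lines = head.Split(new[] { "\r\n" }, StringSplitOptions.None);
        for (int i = 1; i < lines.Length; i++)
        {
            int colon = lines[i].IndexOf(':');
            if (colon <= 0) continue; // not a header line
            string name = lines[i].Substring(0, colon).Trim();
            headers[name] = lines[i].Substring(colon + 1).Trim();
        }
        return headers;
    }
}
```

Usage would look something like: string head, body; if (RawHttpResponse.TrySplit(raw, out head, out body)) { var headers = RawHttpResponse.ParseHeaders(head); }. Keep in mind that if the server sends Transfer-Encoding: chunked, the body you get this way still contains the chunk-size lines and needs further decoding.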
reltnek
Thanks for the suggestion. I was doing something like this already, but you've formatted it a bit nicer.
Ed