I have a web service that acts as an interface between a farm of websites and some analytics software. Part of the analytics tracking requires harvesting the page title. Rather than passing it from the web page to the web service, I would like to use HttpWebRequest to request the page.

I have code that will get the entire page and parse the HTML to grab the title tag, but I don't want to download the entire page just to get information that's in the head.

I've started with

HttpWebRequest request = (HttpWebRequest)HttpWebRequest.Create("url");
request.Method = "HEAD";

A: 

Great idea, but a HEAD request only returns the document's HTTP headers. This does not include the title element, which is part of the HTTP message body.

R. Bemrose
So is there any way to get this information without downloading the entire page?
Well, you could read the response in chunks, but I think that the framework itself would have already received the entire response even though you haven't processed it.
R. Bemrose
A: 

So I would have to go with something like...

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(URL);
// using blocks close the response, stream and reader even if an exception is thrown.
using (HttpWebResponse resp = (HttpWebResponse)req.GetResponse())
using (Stream st = resp.GetResponseStream())
using (StreamReader sr = new StreamReader(st))
{
    // Note: ReadToEnd still downloads the whole body.
    string buffer = sr.ReadToEnd();
    string title = null;
    int startPos = buffer.IndexOf("<title>", StringComparison.OrdinalIgnoreCase);
    if (startPos >= 0)
    {
        startPos += "<title>".Length;
        int endPos = buffer.IndexOf("</title>", startPos, StringComparison.OrdinalIgnoreCase);
        // IndexOf returns -1 when a tag is missing, so check before calling Substring.
        if (endPos >= 0)
            title = buffer.Substring(startPos, endPos - startPos).Trim();
    }
    Console.WriteLine("Response code from {0}: {1}", URL, resp.StatusCode);
    Console.WriteLine("Page title: {0}", title ?? "(none found)");
}
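
For comparison, here is a rough sketch of the "read the response in chunks" idea from the comment above: stop reading as soon as the closing title tag has been seen instead of calling ReadToEnd. The names TitleReader, ReadTitle, FetchTitle and the maxChars limit are made up for illustration. Calling Abort() after the title is found is an assumption about how to keep the rest of the body from being pulled down; how much the framework has already buffered by then may vary by runtime, so treat this as a starting point, not a guarantee.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

public static class TitleReader
{
    // Reads from `reader` in 1 KB chunks and returns the <title> text as soon as
    // </title> has been seen. Gives up (returns null) once maxChars characters
    // have been read without finding it; maxChars is a hypothetical safety limit.
    public static string ReadTitle(TextReader reader, int maxChars = 64 * 1024)
    {
        var seen = new StringBuilder();
        var chunk = new char[1024];
        int read;
        while (seen.Length < maxChars && (read = reader.Read(chunk, 0, chunk.Length)) > 0)
        {
            seen.Append(chunk, 0, read);
            // Re-scanning the whole buffer each time is simple, and cheap at this size.
            string text = seen.ToString();
            int end = text.IndexOf("</title>", StringComparison.OrdinalIgnoreCase);
            if (end < 0) continue; // closing tag not received yet

            int open = text.IndexOf("<title", StringComparison.OrdinalIgnoreCase);
            if (open < 0 || open > end) return null;
            int start = text.IndexOf('>', open); // end of the opening tag, skipping any attributes
            if (start < 0 || start > end) return null;
            return text.Substring(start + 1, end - start - 1).Trim();
        }
        return null;
    }

    // Fetches `url` and reads only as much of the body as ReadTitle needs.
    public static string FetchTitle(string url)
    {
        var req = (HttpWebRequest)WebRequest.Create(url);
        using (var resp = (HttpWebResponse)req.GetResponse())
        using (var reader = new StreamReader(resp.GetResponseStream()))
        {
            string title = ReadTitle(reader);
            req.Abort(); // assumption: cancel the request so the remaining body is not read
            return title;
        }
    }
}
```

Usage would be something like `string title = TitleReader.FetchTitle(URL);`. Since ReadTitle takes any TextReader, it can also be tried against a StringReader without touching the network.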