I think it's possible with jQuery, but any ASP.NET server-side code is good for my situation too.

With jQuery I can load a page into, for example, a div and filter the div for the <title> tag, but for heavy pages I think it is not good to read all of the content first and only then read the title tag. Or maybe there is a very simple solution? Anyway, I couldn't find anything about this on the internet. Thanks.

A: 

It would be a security risk for you to load any other web page into yours just to read the title. You should do this with server-side scripting (ASP.NET, PHP, ...) and just output the title to your web page. Think of some kind of caching, because it is senseless to fetch titles on every request.
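
A minimal sketch of what such caching could look like in C#. The class name `TitleCache` and the `fetchTitle` delegate are hypothetical, made up for illustration; the delegate stands in for whatever server-side code actually downloads the page. There is no expiry here, and under contention the delegate may run more than once for the same URL.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical helper: remembers titles per URL so each page is fetched only once.
sealed class TitleCache
{
    private readonly ConcurrentDictionary<string, string> titles =
        new ConcurrentDictionary<string, string>();
    private readonly Func<string, string> fetchTitle;

    // fetchTitle is an assumed delegate that does the real (slow) download.
    public TitleCache(Func<string, string> fetchTitle)
    {
        this.fetchTitle = fetchTitle;
    }

    // Returns the cached title, calling fetchTitle only on a cache miss.
    public string Get(string url)
    {
        return titles.GetOrAdd(url, fetchTitle);
    }
}
```

If the title is stored in a database together with the link, the database row plays the same role and an in-memory cache like this may not be needed at all.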

glavić
Yeah, that's what I was thinking about, but I couldn't find a solution to read the title of a website that a user gives as a link.
mohamadreza
A: 

There is no simple clean way to retrieve an external page's title. You could do it server side using a WebClient and parsing the response.

However, it may be worth reviewing the requirement: is it really necessary, and how much extra traffic and latency is it going to generate? Consider also that you could be generating load on the external site, which is unaware that all you want is a title; the page creation may be quite expensive.

AnthonyWJones
Yep, right. What I am going to do is this: after users submit their website or favorite links to the application, the app reads the title and saves the link with the title to the database, rather than forcing the user to fill in a "title" field for their website.
mohamadreza
A: 
string title = Regex.Match(new System.Net.WebClient().DownloadString(url), @"<title>(.*?)</title>").Groups[1].Value;

Try this; I am not sure it works.

cjjer
Syntax error for me on [0].
mohamadreza
A: 

Titles usually appear within the first few hundred bytes, so you could try a range request for the first 1KiB or so, try parsing that (with an error-correcting parser, since some closing tags will be missing) and if that fails fall back to loading the whole page.
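
A rough C# sketch of that idea, under some loud assumptions: `TitleFetcher`, `ExtractTitle`, `Download` and `GetTitle` are made-up names, a tolerant regex stands in for a real error-correcting parser, the 1024-byte prefix is arbitrary, and the response is decoded with `StreamReader`'s default UTF-8 rather than the page's declared charset.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;

static class TitleFetcher
{
    // Tolerates attributes on <title>, any letter case, and newlines in the title.
    static readonly Regex TitlePattern = new Regex(
        @"<title\b[^>]*>(.*?)</title\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Returns the decoded, trimmed title, or null if no complete <title> element is found.
    public static string ExtractTitle(string html)
    {
        Match m = TitlePattern.Match(html);
        return m.Success ? WebUtility.HtmlDecode(m.Groups[1].Value).Trim() : null;
    }

    // Downloads either the first `firstBytes` bytes (via a Range header) or the whole page.
    static string Download(string url, int? firstBytes)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        if (firstBytes.HasValue)
            request.AddRange(0, firstBytes.Value - 1); // servers are free to ignore Range

        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            if (!firstBytes.HasValue)
                return reader.ReadToEnd();

            // Cap what we read ourselves in case the server sent the full page anyway.
            var buffer = new char[firstBytes.Value];
            int read = reader.ReadBlock(buffer, 0, buffer.Length);
            return new string(buffer, 0, read);
        }
    }

    public static string GetTitle(string url)
    {
        // Cheap attempt first; fall back to the whole page if the title was cut off or absent.
        return ExtractTitle(Download(url, 1024)) ?? ExtractTitle(Download(url, null));
    }
}
```

The fallback costs a second request, so this only pays off if most titles really do fit in the prefix.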

finnw
A: 

I am not sure whether all servers support this.
See if this helps:


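// Requires: using System; using System.IO; using System.Net;
char[] data = new char[300];
HttpWebRequest wr = (HttpWebRequest)WebRequest.Create("http://www.yahoo.com");
wr.AddRange("bytes", 0, 299); // bytes 0..299 inclusive = 300 bytes; the server may ignore this
using (HttpWebResponse wre = (HttpWebResponse)wr.GetResponse())
using (StreamReader sr = new StreamReader(wre.GetResponseStream()))
{
    // ReadBlock keeps reading until the buffer is full or the stream ends
    int read = sr.ReadBlock(data, 0, data.Length);
    Console.WriteLine(new string(data, 0, read));
}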

EDIT: Try checking with a network monitoring tool to see what text the servers send out. I used Fiddler to see the output and wrote it to the console.

EDIT2: I am assuming the title to be in the beginning of the page.

shahkalpesh
Thanks, it worked, but it is not reliable: it doesn't always return content from the URL you entered. Redirects, page includes and AJAX actions break it.
mohamadreza
+1  A: 

cjjer almost got it right.

First, change the regex to: <title>(?<Content>.*?)</title>

Second, you need to create a match object first (just in case your URI does not have a title).

Match tMatch = new Regex(@"<title>(?<Content>.*?)</title>").Match(new System.Net.WebClient().DownloadString(url));

if (tMatch.Success) { title = tMatch.Groups["Content"].Value; } // yay.

I don't know much about regex; it throws this error: parsing "(?.*?)?" - Unrecognized grouping construct
mohamadreza
HTML-decode that. Dunno why they don't do that for you.
+1  A: 

Okay, thanks to cjjer and Boo. I've just read more about regex, and the code below finally works for me.

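' Requires: Imports System.Net and Imports System.Text.RegularExpressions
Dim theuri As New Uri(TextBox1.Text)
Dim res As String
Using qq As New System.Net.WebClient()
    res = qq.DownloadString(theuri)
End Using
' Singleline lets "." span newlines; IgnoreCase also matches <TITLE>
Dim re As New Regex("<title\b[^>]*>(.*?)</title>", RegexOptions.Singleline Or RegexOptions.IgnoreCase)
Dim ma As Match = re.Match(res)

' Match() never returns Nothing, so checking Success is enough
If ma.Success Then
    Response.Write(ma.Groups(1).Value)
Else
    Response.Write("error")
End If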

But the problem remains: this code downloads the whole page and searches through it, which on heavy websites takes more than 2 or 3 seconds to complete. It seems to be the only way as far as I know :| Are there any suggestions to refine this code?

mohamadreza
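
One possible refinement, sketched in C# rather than VB.NET: read the response in chunks and stop as soon as a complete <title> element has been seen, instead of calling DownloadString on the whole page. Everything named here (`StreamingTitleReader`, `ReadTitle`, the 512-character chunk, the 64 KiB cap, the 5 second timeout) is an assumption for illustration, not something from the answers above. Unlike a range request, this follows redirects the usual way and does not depend on the server honoring a Range header.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using System.Text.RegularExpressions;

static class StreamingTitleReader
{
    static readonly Regex TitlePattern = new Regex(
        @"<title\b[^>]*>(.*?)</title\s*>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Reads chunk by chunk and returns at the first complete <title> element,
    // or returns null once maxChars characters have been read without finding one.
    public static string ReadTitle(TextReader reader, int maxChars)
    {
        var seen = new StringBuilder();
        var chunk = new char[512];
        int read;
        while (seen.Length < maxChars && (read = reader.Read(chunk, 0, chunk.Length)) > 0)
        {
            seen.Append(chunk, 0, read);
            // Re-scanning the whole buffer is wasteful but harmless at this size.
            Match m = TitlePattern.Match(seen.ToString());
            if (m.Success)
                return WebUtility.HtmlDecode(m.Groups[1].Value).Trim();
        }
        return null;
    }

    public static string GetTitle(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 5000; // milliseconds; an arbitrary choice
        using (var response = request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            // Disposing the response early abandons the rest of the download.
            return ReadTitle(reader, 64 * 1024);
        }
    }
}
```

Pages that build their title with JavaScript will still return whatever is in the raw HTML, so this does not help with the AJAX case mentioned earlier.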