ansaurus

Question

How can I determine in C# whether a web page has an RSS feed?

Answer 1

+1 A:

I expect you would have to load the page into a DOM (XmlDocument, XDocument or HtmlDocument) and check for any nodes like:

<link rel="alternate" type="application/atom+xml" ...

This should be (in XPath) something like "/html/head/link[@rel='alternate' and @type='application/atom+xml']" - then look at @title and @href. For an RSS feed rather than an Atom feed, the type to look for is usually application/rss+xml.
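As a rough sketch of that check, assuming the page is well-formed XHTML (XDocument.Parse throws on ordinary malformed HTML), something like the following could work. The class and method names (FeedLinkFinder, FindFeeds) are made up for illustration, and matching on the element's local name is my own choice so that an XHTML namespace on the root element does not get in the way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class FeedLinkFinder
{
    // MIME types commonly used to advertise RSS and Atom feeds.
    private static readonly string[] FeedTypes =
    {
        "application/rss+xml",
        "application/atom+xml"
    };

    // Returns the title and href of every feed <link> in a well-formed XHTML string.
    public static List<(string Title, string Href)> FindFeeds(string xhtml)
    {
        // Throws XmlException if the markup is not well-formed XML.
        XDocument doc = XDocument.Parse(xhtml);

        return doc.Descendants()
            // Compare local names so a default XHTML namespace does not matter.
            .Where(e => e.Name.LocalName == "link")
            .Where(e => string.Equals((string)e.Attribute("rel"), "alternate",
                                      StringComparison.OrdinalIgnoreCase))
            .Where(e => FeedTypes.Contains(
                ((string)e.Attribute("type") ?? "").ToLowerInvariant()))
            .Select(e => (Title: (string)e.Attribute("title") ?? "",
                          Href: (string)e.Attribute("href") ?? ""))
            .ToList();
    }
}
```

An empty list means no feed was advertised; otherwise each entry carries the @title and @href mentioned above.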

Marc Gravell 2009-11-19 12:17:15

Answer 2

+1 A:

Instead of loading the HTML into an XmlDocument (which may not be possible if it isn't XHTML-compliant), try the HTML Agility Pack. It gives you an XmlDocument-like API, but it can also parse malformed HTML.

But generally, you would look for that link tag in the page's head.
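A minimal sketch of that with the HTML Agility Pack might look like this. It assumes the HtmlAgilityPack NuGet package is referenced; AgilityFeedFinder and FindFeedUrls are hypothetical names, and accepting both the RSS and Atom MIME types is an assumption on my part.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // third-party NuGet package, not part of the .NET base library

// Hypothetical helper, not part of the HTML Agility Pack itself.
public static class AgilityFeedFinder
{
    // Returns the href of every <link rel="alternate"> that advertises an RSS or Atom feed.
    public static List<string> FindFeedUrls(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // tolerant of malformed HTML

        var urls = new List<string>();

        // SelectNodes returns null, not an empty collection, when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//link[@rel='alternate']");
        if (links == null)
        {
            return urls;
        }

        foreach (HtmlNode link in links)
        {
            string type = link.GetAttributeValue("type", "").ToLowerInvariant();
            if (type == "application/rss+xml" || type == "application/atom+xml")
            {
                urls.Add(link.GetAttributeValue("href", ""));
            }
        }

        return urls;
    }
}
```

The returned href values may be relative, so they would still need to be resolved against the page's own URL before fetching the feed.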

spmason 2009-11-19 12:19:48

Answer 3

+1 A:

Use a regular expression to check the HTML for the link tag.

An exhaustive approach would be to spider each href link and examine the content-type and presence of rss or atom tags...
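For the regular-expression route, a rough sketch could look like the one below. The names (RegexFeedCheck, HasFeedLink) and the patterns are illustrative assumptions rather than a tested, general HTML matcher. It picks out each link tag first and then tests its attributes separately, so the order of rel and type does not matter.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the patterns are a sketch, not a full HTML parser.
public static class RegexFeedCheck
{
    // Matches one whole <link ...> tag.
    private static readonly Regex LinkTag =
        new Regex(@"<link\b[^>]*>", RegexOptions.IgnoreCase);

    // Matches rel="alternate", with single, double, or no quotes.
    private static readonly Regex RelAlternate =
        new Regex(@"\brel\s*=\s*[""']?alternate\b", RegexOptions.IgnoreCase);

    // Matches type="application/rss+xml" or type="application/atom+xml".
    private static readonly Regex FeedType =
        new Regex(@"\btype\s*=\s*[""']?application/(rss|atom)\+xml", RegexOptions.IgnoreCase);

    // Returns true if any <link> tag in the HTML advertises an RSS or Atom feed.
    public static bool HasFeedLink(string html)
    {
        foreach (Match tag in LinkTag.Matches(html))
        {
            if (RelAlternate.IsMatch(tag.Value) && FeedType.IsMatch(tag.Value))
            {
                return true;
            }
        }

        return false;
    }
}
```

This will also match link tags that sit inside comments or scripts, which is one reason a real parser is usually the safer choice.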

Codebrain 2009-11-19 12:22:43

The `<center>` cannot hold it is too late. http://stackoverflow.com/questions/1732348#1732454

Marc Gravell 2009-11-19 12:31:36

Considering he is searching for a known tag, it's not unreasonable to use a regex in this case, IMO.

Codebrain 2009-11-19 13:00:05
