I'm having trouble extracting the RSS link that tells the browser where a site's feed is. The link is found in the <head> tag of the HTML. Here is an example of what the link looks like:
<link rel="alternate" type="application/rss+xml" title="CNN - Top Stories [RSS]" href="http://rss.cnn.com/rss/cnn_topstories.rss" />
My original approach was to treat the page as an XML file and walk through the tags, but most sites have an arbitrary number of <meta> tags that are missing the closing />, so the <link> tag I'm looking for ends up as a child of some random <meta> tag.
Now I'm thinking of just treating the page as a string and searching for the <link> tag in it, but this causes problems because the <link> tag's attributes can appear in any order. Of course I can work around this, but I would prefer something neater than finding type="application/rss+xml" and then scanning to its left and right for the first href.
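
To make the goal concrete, here is a rough sketch of the kind of behavior I'm after. It assumes Python and its standard html.parser module, which is only my choice for illustration; the names FeedLinkFinder and find_rss_links are made up for this example. A lenient HTML parser reports each tag's attributes as name/value pairs, so attribute order stops mattering, and unclosed <meta> tags do not change how <link> is reported because no tree is built.

```python
from html.parser import HTMLParser


class FeedLinkFinder(HTMLParser):
    """Collects the href of every <link> tag that advertises an RSS feed.

    FeedLinkFinder is a hypothetical name used only for this sketch.
    """

    def __init__(self):
        super().__init__()
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing tags such as <link ... />,
        # because the default handle_startendtag delegates here.
        if tag != "link":
            return
        # attrs is a list of (name, value) pairs; names arrive lowercased
        # and a value is None when the attribute has no value.
        attributes = {name: (value or "") for name, value in attrs}
        rel_tokens = attributes.get("rel", "").lower().split()
        mime_type = attributes.get("type", "").strip().lower()
        href = attributes.get("href", "")
        if "alternate" in rel_tokens and mime_type == "application/rss+xml" and href:
            self.feeds.append(href)


def find_rss_links(html):
    """Return the feed URLs found in an HTML string, in document order."""
    finder = FeedLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.feeds
```

Calling find_rss_links(page_source) returns a list, since a page may advertise more than one feed. Relative href values are returned as written and would still need to be resolved against the page URL.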