META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=iso-8859-1" />
TITLE>Microsoft Corporation
META http-equiv="PICS-Label" content="(PICS-1.1 "http://www.rsac.org/ratingsv01.html" l gen true r (n 0 s 0 v 0 l 0))" />
META NAME="KEYWORDS" CONTENT="products; headlines; downloads; news; Web site; what's new; solutions; services; software; contests; corporate news;" />
META NAME="DESCRIPTION" CONTENT="The entry page to Microsoft's Web site. Find software, solutions, answers, support, and Microsoft news." />
META NAME="MS.LOCALE" CONTENT="EN-US" />
META NAME="CATEGORY" CONTENT="home page" />

I'd like to know what XPath expression I need to get the value of the Content attribute of the Category meta tag using HTML Agility Pack. (I removed the first < of each line in the HTML code so it would post.)

A: 

For a long time, HtmlAgilityPack didn't have the ability to query an attribute value directly. You had to loop over the list of meta nodes. Here's one way:

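var doc = new HtmlDocument();
doc.LoadHtml(htmlString);

// SelectNodes returns null (not an empty list) when nothing matches.
var list = doc.DocumentNode.SelectNodes("//meta");
if (list != null)
{
    foreach (var node in list)
    {
        // The second argument is the default used when the attribute is missing.
        string content = node.GetAttributeValue("content", "");
    }
}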
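
Applied to the markup in the question, the same approach can be narrowed to the single Category tag with an attribute predicate. This is a minimal sketch, assuming the HtmlAgilityPack package is referenced; the sample htmlString is a shortened stand-in for the real page. As far as I know, HtmlAgilityPack exposes element and attribute names in lowercase to XPath, while attribute values keep their original case, hence the lowercase meta and name and the uppercase CATEGORY.

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Shortened stand-in for the page source from the question.
        string htmlString =
            "<html><head>" +
            "<META NAME=\"MS.LOCALE\" CONTENT=\"EN-US\" />" +
            "<META NAME=\"CATEGORY\" CONTENT=\"home page\" />" +
            "</head><body></body></html>";

        var doc = new HtmlDocument();
        doc.LoadHtml(htmlString);

        // Select only the meta element whose name attribute is CATEGORY.
        // SelectSingleNode returns null when nothing matches.
        HtmlNode node = doc.DocumentNode.SelectSingleNode("//meta[@name='CATEGORY']");

        // Fall back to an empty string if the tag or the attribute is missing.
        string category = node?.GetAttributeValue("content", "") ?? "";
        Console.WriteLine(category);
    }
}
```

The null-conditional call keeps the lookup from throwing when the page has no Category tag at all.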

But it looks like there is an experimental XPath release that will let you do that.

doc.DocumentNode.SelectNodes("//meta/@content")

will return a list of HtmlAttribute objects.

Rohit Agarwal
A: 
Actually, a slightly better way would be to use

title = doc.DocumentNode.SelectSingleNode("//meta[@name='title']").GetAttributeValue("content", String.Empty)

Or, better yet:

title = doc.DocumentNode.SelectSingleNode("//meta[@name='title']/@content")

Note that the variant ending in .ToString, title = doc.DocumentNode.SelectSingleNode("//meta[@name='title']/@content").ToString, won't work...