I'm using the HTML Agility Pack to parse an ASPX file inside Visual Studio.

I'm searching for an element with a specified ID attribute.

The code I'm using is:

var html = new HtmlAgilityPack.HtmlDocument();
html.LoadHtml(docText);
if (html.DocumentNode != null)
{
     try
     {
          var tagsWithId = html.DocumentNode.SelectNodes(string.Format("//[@id='{0}']", selector.Id));

However, when I run this code it throws the exception "Expression must evaluate to a node-set".

Can anyone tell me why this "must" evaluate to a node-set? Why can't it simply return no nodes (the next line calls tagsWithId.Count)? Surely the HtmlNodeCollection that is returned by the SelectNodes method can contain 0 nodes?

Or is the error due to a malformed XPath expression? [The selector ID which I'm testing this with definitely exists in the file as <div id="thisId">.]

Is it even possible to load an ASPX file straight from Visual Studio (I'm building an add-in), or will this contain XML errors, so that I instead have to load the output HTML stream (i.e., without the page declaration at the start of the file, etc.)?

A: 

The problem is in the argument to SelectNodes():

//[@id='{0}']

(after carrying out the replacement) is not a syntactically legal XPath expression. So the problem is not that the XPath expression "returns no nodes"; the problem is that the expression itself is syntactically illegal.

As per the XPath W3C Spec:

"// is short for /descendant-or-self::node()/"

Thus the above is expanded to:

/descendant-or-self::node()/[@id='{0}']

Notice that the last location step has no node test and starts with the predicate. This is illegal according to the syntax rules of XPath.

Probably you want:

//*[@id='{0}']
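
To tie this back to the code in the question, here is a minimal sketch of the corrected call. The sample markup and the local variable id are assumptions standing in for docText and selector.Id from the question. It also guards against a null result, because as far as I know SelectNodes returns null rather than an empty collection when nothing matches (at least with the default options), so calling Count directly on the result could fail:

```csharp
using System;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical input standing in for docText from the question.
        string docText = "<html><body><div id=\"thisId\">hello</div></body></html>";
        // Hypothetical value standing in for selector.Id.
        string id = "thisId";

        var html = new HtmlDocument();
        html.LoadHtml(docText);

        // "*" supplies the node test that was missing: any element whose id matches.
        var tagsWithId = html.DocumentNode.SelectNodes(
            string.Format("//*[@id='{0}']", id));

        // Treat a null result as "no matches" before touching Count.
        int count = tagsWithId == null ? 0 : tagsWithId.Count;
        Console.WriteLine(count);
    }
}
```

Note that an id containing a single quote would break the formatted expression, so this sketch assumes the ids are plain identifiers.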
Dimitre Novatchev
Thank you very much, Dimitre, you're absolutely correct.
awj