My XML looks like this:

<?xml version="1.0"?>
<itemSet>
       <Item>one</Item>
       <Item>two</Item>
       <Item>three</Item>
       .....maybe more Items here.
</itemSet>

Any individual Item may or may not be present. Say I want to retrieve the element <Item>two</Item> if it's present. I've tried the following XPaths (in C#).

  • XmlNode node = myXMLdoc.SelectSingleNode("/itemSet[Item='two']"); --- If Item two is present, this gives me only the first element, one. Maybe this query just points to the first element in itemSet if itemSet has an Item with value two somewhere as a child. Is this interpretation correct?

So I tried:

  • XmlNode node = myXMLdoc.SelectSingleNode("/itemSet[Item='two']/Item[1]"); --- I read this query as "return the first <Item> element within itemSet that has value 'two'". Am I correct?

This still returns only the first element, one. What am I doing wrong? In both cases I can traverse the child nodes via the siblings and get to two, but that's not what I'm after. Also, if two is absent, SelectSingleNode returns null. So a successful (non-null) return does indicate the presence of element two, and if all I wanted was a boolean test for the presence of two, either of the above XPaths would suffice. But I actually need the full element <Item>two</Item> as my return node.

[My first question here, and my first time working with web programming, so I just learned the above XPaths and related xml stuff on the fly right now from past questions in SO. So be gentle, and let me know if I am a doofus or flouting any community rules. Thanks.]

+5  A: 

I think you want:

myXMLdoc.SelectSingleNode("/itemSet/Item[text()='two']")

In other words, you want the Item which has text of two, not the itemSet containing it.

You can also use a single dot to indicate the context node, in your case:

myXMLdoc.SelectSingleNode("/itemSet/Item[.='two']")
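
As a rough illustration of what each expression selects, here is a minimal sketch run against the sample XML from the question. The class name ItemLookup, the helper Select and the constant SampleXml are made up for this example; only XmlDocument.LoadXml and SelectSingleNode are the real System.Xml calls.

```csharp
using System;
using System.Xml;

public static class ItemLookup
{
    // Sample document from the question, inlined for the sketch (hypothetical constant).
    public const string SampleXml =
        "<?xml version=\"1.0\"?>" +
        "<itemSet><Item>one</Item><Item>two</Item><Item>three</Item></itemSet>";

    // Hypothetical helper: loads SampleXml and returns the first node
    // matching xpath, or null when nothing matches.
    public static XmlNode Select(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc.SelectSingleNode(xpath);
    }

    public static void Main()
    {
        // The predicate filters itemSet, so the itemSet element itself is returned.
        Console.WriteLine(Select("/itemSet[Item='two']").Name);               // itemSet

        // Item[1] is simply the first Item child of that itemSet.
        Console.WriteLine(Select("/itemSet[Item='two']/Item[1]").InnerText);  // one

        // Moving the predicate onto Item selects the Item whose value is two.
        Console.WriteLine(Select("/itemSet/Item[.='two']").OuterXml);         // <Item>two</Item>

        // No matching Item: SelectSingleNode returns null.
        Console.WriteLine(Select("/itemSet/Item[.='four']") == null);         // True
    }
}
```

This suggests that the first query in the question was returning the itemSet element rather than the first Item, which would explain why the sibling and child traversal still reached two.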

EDIT: The difference between . and text() is that . means "this node" effectively, and text() means "all the text node children of this node". In both cases the comparison will be against the "string-value" of the LHS. For an element node, the string-value is "the concatenation of the string-values of all text node descendants of the element node in document order" and for a collection of text nodes, the comparison will check whether any text node is equal to the one you're testing against.

So it doesn't matter when the element content only has a single text node, but suppose we had:

<root>
  <item name="first">x<foo/>y</item>
  <item name="second">xy<foo/>ab</item>
</root>

Then an XPath expression of "root/item[.='xy']" will match the first item, but "root/item[text()='xy']" will match the second.
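
To check this, a small sketch along the following lines can be used. The class DotVersusText, the helper MatchName and the constant MixedXml are hypothetical names introduced for the example; the XML is the sample above.

```csharp
using System;
using System.Xml;

public static class DotVersusText
{
    // The mixed-content sample from above, inlined (hypothetical constant).
    public const string MixedXml =
        "<root>" +
        "<item name=\"first\">x<foo/>y</item>" +
        "<item name=\"second\">xy<foo/>ab</item>" +
        "</root>";

    // Hypothetical helper: returns the name attribute of the first item
    // matched by xpath, or null when nothing matches.
    public static string MatchName(string xpath)
    {
        var doc = new XmlDocument();
        doc.LoadXml(MixedXml);
        XmlNode node = doc.SelectSingleNode(xpath);
        return node == null ? null : node.Attributes["name"].Value;
    }

    public static void Main()
    {
        // "." compares the element's string-value: "x" + "y" = "xy" for the first item.
        Console.WriteLine(MatchName("root/item[.='xy']"));       // first

        // text() compares each text node child separately: "xy" is a child of the second item.
        Console.WriteLine(MatchName("root/item[text()='xy']"));  // second
    }
}
```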

Jon Skeet
Thanks a ton, Jon. Both your solutions worked like a charm. While on topic, can I ask what the difference is between using text() and .? In my sample XML it's a simple element with just one text value, so both solutions return the same thing. Is there a more complicated XML situation where one is preferred over the other, or where the results differ? (Other than marking your answer correct and voting it up, is there anything community-wise on SO I can do to thank you?)
areaMan
@areaMan: To be honest, I'm not sure of the difference between "text()" and ".". My XPath-fu is fairly minimal :) You might want to look at the XPath specs to find out more: http://www.w3.org/TR/xpath As for what you can do to thank me... find a question you can answer, and answer it as well as you can. Oh, and keep asking good questions :)
Jon Skeet
Having said that about not knowing the difference, I know *some* of the differences between . and text() - will edit.
Jon Skeet
Thanks. That helps.
areaMan