Hi

I have the following simple xml:

<root>
 <item>
  <d>2002-05-30T09:00:00</d>
 </item>
 <item>
  <d>2005-05-30T09:00:00</d>
 </item>
 <item>
  <d>2003-05-30T09:00:00</d>
 </item>
</root>

Now I want to find the minimum or maximum dateTime node using XPath.

My solution at the moment is:

/root/item[not(number(translate(./d, 'TZ:-', '.')) <= number(translate(following-sibling::item, 'TZ:-', '.')))][not(number(translate(./d, 'TZ:-', '.')) <= number(translate(preceding-sibling::item, 'TZ:-', '.')))][1]/d

It works, but it is ugly as hell and not very efficient. Basically, it converts each dateTime to a number and then compares the numbers with each other. I adapted this from here.

What is the nicest way to do this?

Cheers

neo

+5  A: 

You can't do this in XPath 1.0 unless you know the number of item elements in advance. Every function that takes a non-node-set argument converts a node-set by using only its first node, and the order comparison operators don't work on strings.

In XPath 2.0 you could use:

max(/root/item/d/xs:dateTime(.))
Alejandro
Works very well, I really like this syntax, see also my comment on the other answer.
neo
@neo: I'm glad it has helped you. Now that you know you have XQuery 1.0 and XPath 2.0, you will find a lot of improvements over XPath 1.0: in this case not only `fn:max` but also the possibility of using expressions as the final step in a path. Also, I've retagged your question.
Alejandro
+2  A: 

@neo, the XPath expression you list doesn't work when I test it. Try a different data set and you'll see:

<root>
   <item>
      <d>2003-05-30T09:00:00</d>
   </item>
   <item>
      <d>2002-05-30T09:00:00</d>
   </item>
   <item>
      <d>2005-05-30T09:00:00</d>
   </item>
</root>

Your XPath produces 2003-05-30T09:00:00, which obviously is not the max.

And it makes sense that it doesn't work, because the preceding-sibling:: and following-sibling:: axes inside the translate() functions will only yield one sibling each. You're trying to do a general (set) comparison over all the siblings on each axis, but the first argument to translate() has to be converted to a string before the general comparison operator has a chance to do its thing. Converting a node-set to a string ignores all nodes except the first one in document order.
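
To see this rule in action, here is a rough Python emulation of the original expression. It is only a sketch: the helper names (key, first_string, original_xpath) are made up for illustration, and float() merely approximates XPath's number(), which is close enough for these values.

```python
import math
import xml.etree.ElementTree as ET

DOC = """<root>
  <item><d>2003-05-30T09:00:00</d></item>
  <item><d>2002-05-30T09:00:00</d></item>
  <item><d>2005-05-30T09:00:00</d></item>
</root>"""

# Same effect as translate(x, 'TZ:-', '.'): T becomes '.', while Z, ':' and '-' are dropped.
TABLE = str.maketrans("T", ".", "Z:-")


def key(text):
    """Approximate number(translate(text, 'TZ:-', '.')); NaN if not numeric."""
    try:
        return float(text.translate(TABLE).strip())
    except ValueError:
        return math.nan


def first_string(nodes):
    """XPath 1.0 string(node-set): string-value of the first node in document order, or ''."""
    return "".join(nodes[0].itertext()) if nodes else ""


def original_xpath(doc):
    items = ET.fromstring(doc).findall("item")
    survivors = []
    for i, item in enumerate(items):
        current = key(item.findtext("d"))
        following = key(first_string(items[i + 1:]))  # only the first following sibling counts
        preceding = key(first_string(items[:i]))      # only the first preceding sibling in document order
        # Any comparison with NaN is false, just as in XPath.
        if not current <= following and not current <= preceding:
            survivors.append(item.findtext("d"))
    return survivors[0] if survivors else None  # the trailing [1]


print(original_xpath(DOC))  # 2003-05-30T09:00:00
```

The first item survives both predicates because it is only ever compared with the 2002 item, never with the 2005 one.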

Furthermore, look at what translate(./d, 'TZ:-', '.') actually does. T is replaced by '.', while Z, ':' and '-' are removed, because characters in the second argument with no counterpart in the third are deleted. The result is 20030530.090000, which is a valid number but needlessly puts the time after a decimal point. translate(./d, 'TZ:-', '') is simpler and yields the integer 20030530090000.
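
The XPath 1.0 rule for translate() is that each character of the second argument maps to the character at the same position in the third, and characters with no counterpart there are removed. A quick way to check what a given call returns is to emulate that rule. This is a sketch in Python, and xpath_translate is a name invented here, not a library function:

```python
def xpath_translate(text, source, target):
    """Sketch of XPath 1.0 translate(): map source[i] to target[i];
    characters of source with no counterpart in target are removed."""
    mapping = {}
    for i, ch in enumerate(source):
        if ch not in mapping:  # the first occurrence in source wins
            mapping[ch] = target[i] if i < len(target) else ""
    return "".join(mapping.get(ch, ch) for ch in text)


print(xpath_translate("2003-05-30T09:00:00", "TZ:-", "."))  # 20030530.090000
print(xpath_translate("2003-05-30T09:00:00", "TZ:-", ""))   # 20030530090000
```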

Alejandro says it's not possible to do this in XPath 1.0, and he may be right. Let's try it, and maybe we'll learn something even if we don't succeed.

Next, I would try to use the general comparison outside the translate function, so that it can compare whole node-sets. Something like this naive attempt:

/root/item[
    not(following-sibling::item[
         translate($current/d, 'TZ:-', '') <= translate(./d, 'TZ:-', '')])
    and not(preceding-sibling::item[
         translate($current/d, 'TZ:-', '') <= translate(./d, 'TZ:-', '')])]

However, this is incomplete, as shown by the pseudo-variable $current, which is supposed to refer to the outermost item, the one that is the context node outside all predicates. Unfortunately, XPath 1.0 does not give us a way to refer to that outer context once another context has been pushed on the stack by an inner predicate.
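
For comparison, in a host language the outer item is just a named variable, so the intended test is easy to state. Here is a minimal Python sketch of what the expression above is trying to compute. The function name and the flat list of strings are illustrative assumptions, not part of the question:

```python
def strictly_greatest(date_times):
    """Return the values that no other value equals or exceeds."""
    table = str.maketrans("", "", "TZ:-")  # like translate(x, 'TZ:-', '')
    keys = [int(s.translate(table)) for s in date_times]
    return [
        current  # plays the role of $current
        for i, current in enumerate(date_times)
        if not any(keys[i] <= keys[j] for j in range(len(keys)) if j != i)
    ]


print(strictly_greatest([
    "2003-05-30T09:00:00",
    "2002-05-30T09:00:00",
    "2005-05-30T09:00:00",
]))
# ['2005-05-30T09:00:00']
```

Because the test uses <=, a maximum that occurs twice is rejected by both copies and nothing is returned. Using < instead would return every tied maximum.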

(I seem to recall that some implementations of XSLT, such as maybe MSXML, allow you to do this using an extended feature like current(1), but I can't find information on that at the moment. Anyway you asked for an XPath solution, and current() is not XPath.)

At this point I'm going to agree with Alejandro that it's impossible in pure, standard XPath 1.0.

If you specify the environment you're using XPath in, e.g. XSLT, or Javascript, or XQuery, we can probably suggest an efficient way to get what you need. If it's XPath 2.0, Alejandro has your answer.
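
As one illustration, suppose the host environment were Python (purely an assumption, since the question doesn't say). A sketch using only the standard library, with max_date_time as a made-up helper name, could select the nodes with a simple path and do the comparison outside XPath:

```python
from datetime import datetime
import xml.etree.ElementTree as ET

DOC = """<root>
  <item><d>2002-05-30T09:00:00</d></item>
  <item><d>2005-05-30T09:00:00</d></item>
  <item><d>2003-05-30T09:00:00</d></item>
</root>"""


def max_date_time(doc):
    root = ET.fromstring(doc)
    # "item/d" is relative to the <root> element returned by fromstring().
    return max(datetime.fromisoformat(d.text.strip()) for d in root.findall("item/d"))


print(max_date_time(DOC).isoformat())  # 2005-05-30T09:00:00
```

Swapping max for min gives the earliest dateTime instead.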

If you have XQuery 1.0, it should support XPath 2.0, so you can use Alejandro's solution, with doc() to access your input XML document:

max(doc("myInput.xml")/root/item/d/xs:dateTime(.))
LarsH
@LarsH: the general XPath 1.0 maximum expression `$nodes[not($nodes > .)]` can't be used here because these string values can't be converted to numbers. If you use `fn:translate`, you must make the function call for each node in the node-set (sequences are an XPath 2.0 concept), so you must know the `item` count in advance: `/root/item[translate(.,'TZ:-','') >= translate(../item[1],'TZ:-','')][translate(.,'TZ:-','') >= translate(../item[2],'TZ:-','')][translate(.,'TZ:-','') >= translate(../item[3],'TZ:-','')]`
Alejandro
I have to say, I really like the XPath 2.0 version, but I have only XPath 1.0 available. Although it means more work in my case, I can also use XQuery here. What would that look like?
neo
If you can use XQuery, it should have XPath 2.0 in it. So you can use Alejandro's solution: `max(doc("myInput.xml")/root/item/d/xs:dateTime(.))` Disclaimer: I'm not an XQuery user.
LarsH
@neo - I tested the above XQuery expression and it worked.
LarsH
I just realized that I actually have XPath 2.0 available (in the YAWL workflow engine) because it uses XQuery everywhere anyway. I'm sorry for the confusion. I would have accepted both posts as answers, because yours is right for XPath 1.0 and Alejandro's for XPath 2.0. I will pick the XPath 2.0 solution as my answer because that's what I'm using in the end.
neo
@neo - glad you were able to get it to work.
LarsH