Hi All,

I've got a large XML set that I would like to run some XPath on to reduce it to a much smaller subset. Basically, I have this type of layout:

<root>
  <item>
    <collection1></collection1>
    <collection2></collection2>
    <collection3></collection3>
    ...
    <collection55></collection55>
    <name>item name</name>
    <timestamp>47398743598</timestamp>
    <another1></another1>
    <another2></another2>
    ...
  </item>
  <item>
   ...
  </item>
</root>

In other words, heaps of item nodes, each with lots of other junk nodes that I don't care about.

I would like to run some xpath, to get that down to:

<root>
  <item>
    <name>item name</name>
    <timestamp>47398743598</timestamp>
  </item>
  <item>
   ...
  </item>
</root>

I currently have this type of thing:

//item/name

which only gets the name nodes,

so then I've been trying this type of thing:

//item/name/parent::item

which gets the name nodes and their parent (the item node), but also all of the sibling nodes of the name node, which is what I'm trying to avoid!

Any help would be greatly appreciated

Cheers, Mark

A: 

You could try the union (|) operator: //item/name|//item/timestamp

l0b0
That actually just returns the name and timestamp nodes. Although that is sort of what I was after, I would ideally like them wrapped in their parent item node too.
Mark
+3  A: 

First off: You can't use XPath to get an XML document "down to something". You can use it to select nodes, that's all. If you want to change the XML document, use XSLT.

This expression:

//item/name/parent::item

does not select "the name nodes, and its parent", it selects the parent nodes of <name> nodes, and nothing else.

Strictly speaking, it selects all <item> nodes that happen to be the parent of a <name> node that is itself a child of an <item> node. Which is equivalent to "//item[name]", or to just "//item" if every item has a name, when you think about it.

There is no way to select a structure of nodes. You can only select a list of nodes - a node set. You could then traverse those nodes and find out about their position in the document, but the node set itself is flat.
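To make the "flat node set" point concrete, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The language is my choice for illustration, since the question does not name one, and the sample document is a made-up, cut-down version of the one in the question. ElementTree only supports a subset of XPath, so the expression is spelled `.//item/name/..` instead of `//item/name/parent::item`.

```python
import xml.etree.ElementTree as ET

# Hypothetical, cut-down version of the document from the question.
SAMPLE = """
<root>
  <item>
    <collection1/>
    <name>first</name>
    <timestamp>47398743598</timestamp>
    <another1/>
  </item>
  <item>
    <collection1/>
    <name>second</name>
    <timestamp>47398743599</timestamp>
  </item>
</root>
"""

root = ET.fromstring(SAMPLE)

# ElementTree spelling of //item/name/parent::item
parents = root.findall(".//item/name/..")

# The result is a flat list of <item> elements ...
tags = [elem.tag for elem in parents]

# ... and each one is the original node, so the siblings of <name>
# are still attached to it.
first_children = [child.tag for child in parents[0]]

print(tags)
print(first_children)
```

The selected items are the very same nodes as in the source tree, which is why the unwanted siblings come along: selecting a node never prunes it.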

I think you need to explain more closely what you are trying to do. I could write an XSL transformation that does what you seem to intend, but unless I am sure what you intend... ;-)

EDIT:

Here is one minimalistic XSLT 1.0 approach that would do it.

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="root | item | name | timestamp">
    <xsl:copy>
      <xsl:apply-templates select="*" />
      <xsl:if test="count(*) = 0">
        <xsl:value-of select="text()" />
      </xsl:if>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="* | text()" />

</xsl:stylesheet>

Output for your sample (indentation mine):

<root>
  <item>
    <name>item name</name>
    <timestamp>47398743598</timestamp>
  </item>
  <item>
   ...
  </item>
</root>
Tomalak
Bummer, I was afraid of that. Oh well. The thing is, I'm trying to use a REST API to get XML using query string parameters. What I think I'll have to do is fetch the XML document using DOM4J or something and then use XSLT to transform it. Thanks
Mark
I've added some XSLT that does the necessary changes to your input XML.
Tomalak
+1  A: 

Using XSLT, add this template to the identity transform:

<xsl:template match="item">
   <xsl:copy>
      <xsl:apply-templates select="name | timestamp"/>
   </xsl:copy>
</xsl:template>
Robert Rossney
+1  A: 

Tomalak's answer is great if you really want a trimmed XML document, but with one caveat: his select template will copy any name and timestamp node, not just the ones below an item element.

I suspect, however, that you don't really want a refined XML document, you just want the name and timestamp node for each item. Depending on the language you are using, you should be able to use XPath to give you a smaller node set to work with. In pseudo-code:

  1. Evaluate the XPath "/root/item". This should return some type of list. If you mention your implementation language, I can post a simple snippet.
  2. For each item, select the timestamp and name tags. There's no reason to care about the other nodes.
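The two steps above could look roughly like this in Python with the standard library's xml.etree.ElementTree. This is only a sketch: Python is an assumption on my part, and `extract_items` and the sample data are hypothetical names made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in the shape described in the question.
SAMPLE = """
<root>
  <item>
    <collection1/>
    <name>first</name>
    <timestamp>47398743598</timestamp>
    <another1/>
  </item>
  <item>
    <collection1/>
    <name>second</name>
    <timestamp>47398743599</timestamp>
  </item>
</root>
"""


def extract_items(xml_text):
    """Return a (name, timestamp) tuple for every /root/item element."""
    root = ET.fromstring(xml_text)
    pairs = []
    # Step 1: root is the <root> element, so "item" selects /root/item.
    for item in root.findall("item"):
        # Step 2: read only the two children of interest.
        # findtext returns None when the child element is missing.
        pairs.append((item.findtext("name"), item.findtext("timestamp")))
    return pairs


pairs = extract_items(SAMPLE)
print(pairs)
```

The other child nodes are never touched, so it does not matter how many of them there are.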

However, if you're sure you do want XML, use XSLT.
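If an XSLT processor is not at hand, the same trimming can also be done directly on the parsed tree. Note that this is a different technique from the XSLT answers above, not a translation of them. A minimal sketch, again assuming Python's xml.etree.ElementTree, with `trim_items` and `KEEP` as hypothetical names:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data in the shape described in the question.
SAMPLE = """
<root>
  <item>
    <collection1/>
    <name>first</name>
    <timestamp>47398743598</timestamp>
    <another1/>
  </item>
  <item>
    <collection1/>
    <name>second</name>
    <timestamp>47398743599</timestamp>
  </item>
</root>
"""

# Child element names to keep inside each <item>.
KEEP = {"name", "timestamp"}


def trim_items(xml_text, keep=KEEP):
    """Parse xml_text and drop every child of /root/item not listed in keep."""
    root = ET.fromstring(xml_text)
    for item in root.findall("item"):
        # Iterate over a copy of the child list, since children are
        # removed from the item while looping.
        for child in list(item):
            if child.tag not in keep:
                item.remove(child)
    return root


trimmed = trim_items(SAMPLE)
print(ET.tostring(trimmed, encoding="unicode"))
```

Only direct children of the item elements under the root are considered here, so a name or timestamp element elsewhere in the document is left alone.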

16bytes