Is it possible with XPath to get a concatenated view of all of the children of a node? I am looking for something like the jQuery .html() method.

For example, if I have the following XML:

<h3 class="title">
    <span class="content">this</span>
    <span class="content"> is</span>
    <span class="content"> some</span>
    <span class="content"> text</span>
</h3>

I would like an XPath query on "h3[@class='title']" that would give me "this is some text".

That is the real question, but if more context/background is helpful, here it is: I am using XPath and I used this post to help me write some complex XSL. My source XML looks like this.

<h3 class="title">Title</h3>
<p>
    <span class="content">Some</span>
    <span class="content"> text</span>
    <span class="content"> for</span>
    <span class="content"> this</span>
    <span class="content"> section</span>
</p>
<p>
    <span class="content">Another</span>
    <span class="content"> paragraph</span>
</p>
<h3 class="title">
    <span class="content">Title</span>
    <span class="content"> 2</span>
    <span class="content"> is</span>
    <span class="content"> complex</span>
</h3>
<p>
    <span class="content">Here</span>
    <span class="content"> is</span>
    <span class="content"> some</span>
    <span class="content"> text</span>
</p>

My output XML considers each <h3> as well as all <p> tags until the next <h3>. I wrote the XSL as follows:

<xsl:template match="h3[@class='title']">
...
    <xsl:apply-templates select="following-sibling::p[
        generate-id(preceding-sibling::h3[1][@class='title'][text()=current()/text()])
        =
        generate-id(current())
    ]"/>
...
</xsl:template>

The problem is that I use text() to identify h3s that are the same. In the example above, text() on the "Title 2 is complex" heading returns only whitespace, because the words sit inside the spans rather than directly inside the h3. My thought was to use something like jQuery's .html() that would return "Title 2 is complex".
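To make the whitespace problem concrete, here is a minimal sketch in Python using only the standard library's xml.etree.ElementTree. It is not part of the XSL above. The helper name direct_text_nodes is made up for illustration, and it only approximates what the XPath step text() selects (text that is a direct child of the element).

```python
import xml.etree.ElementTree as ET

simple = ET.fromstring('<h3 class="title">Title</h3>')
complex_h3 = ET.fromstring(
    '<h3 class="title">\n'
    '    <span class="content">Title</span>\n'
    '    <span class="content"> 2</span>\n'
    '    <span class="content"> is</span>\n'
    '    <span class="content"> complex</span>\n'
    '</h3>'
)

def direct_text_nodes(element):
    """Hypothetical helper: rough analogue of the XPath step text().

    ElementTree stores text that is a direct child of an element in
    element.text (before the first child) and in each child's .tail
    (after that child), so those are collected here.
    """
    pieces = [element.text] + [child.tail for child in element]
    return [piece for piece in pieces if piece]

simple_nodes = direct_text_nodes(simple)
complex_nodes = direct_text_nodes(complex_h3)

print(simple_nodes)
# Every direct text node of the complex heading is whitespace only.
print(all(piece.strip() == "" for piece in complex_nodes))
```

The simple heading has one direct text node holding the title, while the complex heading has only the indentation between the spans, which is why a comparison based on text() cannot tell complex headings apart.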

Update: This might help clarify. After the transform, the desired output for the above would look something like this:

<section>
    <title>Title</title>
    <p>
        <content>Some</content>
        <content> text</content>
        <content> for</content>
        <content> this</content>
        <content> section</content>
    </p>
    <p>
        <content>Another</content>
        <content> paragraph</content>
    </p>
</section>
<section>
    <title>
        <content>Title</content>
        <content> 2</content>
        <content> is</content>
        <content> complex</content>
    </title>
    <p>
        <content>Here</content>
        <content> is</content>
        <content> some</content>
        <content> text</content>
    </p>
</section>
A: 
h3[@class='title']/span[@class='content']/text()

Like this?

h3[@class='title']/descendant::*/text()

Or this?

nuqqsa
Yes, this does seem to help, but it reverses my problem. Now it will pick up the text in "Title 2 is complex" but it will not pick up the text in "Title" where there are no spans. Is there XPath that will pick up both? The worst part is, there are currently only spans in there, but there really could be any tag. I am really hoping for a solution that will pull the text out of any tag.
Brian
A new solution: this will pick up all descendants' content (recursively!)
nuqqsa
Ah! It seems like the solution is your suggestion with a // operator in place of the span (which will select nodes in the document from the current node that match the selection no matter where they are). This seems to work. It seems like I can't put a code snippet in the comment, but the h3 section of the select clause now looks like this: h3[1][@class='title'][//text()=current()//text()]).
Brian
I tried the descendant solution in my XSL and it had the same effect as the /span.../text() option. It matches the complex title but not the simple one. It seems like //text() is the only one that matches both.
Brian
A: 

From http://www.w3.org/TR/xpath/#dt-string-value

The string-value of an element node is the concatenation of the string-values of all text node descendants of the element node in document order.

So, you are looking for the function string()

From http://www.w3.org/TR/xpath/#function-string

A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned

Anyway, keep in mind that string() is often applied implicitly, for example in xsl:value-of and in node-set comparisons.
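As a rough illustration of the string-value definition quoted above, the following Python sketch (standard library only) concatenates all descendant text of the h3 from the question. The function names string_value and normalize_space are made up here to mirror the XPath functions string() and normalize-space(). Python's str.split() treats a few more characters as whitespace than XPath does, so this is only an approximation.

```python
import xml.etree.ElementTree as ET

h3 = ET.fromstring(
    '<h3 class="title">\n'
    '    <span class="content">this</span>\n'
    '    <span class="content"> is</span>\n'
    '    <span class="content"> some</span>\n'
    '    <span class="content"> text</span>\n'
    '</h3>'
)

def string_value(element):
    """Concatenate every descendant text node in document order."""
    return "".join(element.itertext())

def normalize_space(text):
    """Trim the ends and collapse internal whitespace runs to one space."""
    return " ".join(text.split())

raw = string_value(h3)
clean = normalize_space(raw)

print(repr(raw))    # still contains the line breaks and indentation
print(repr(clean))
```

In XPath terms, string(h3[@class='title']) keeps the indentation between the spans, so normalize-space(h3[@class='title']) is likely the closer match for the "this is some text" result asked for.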

But if what you really need is to apply templates to the "p" elements that follow a certain "h3" element, I recommend using the expression for node-set subtraction. In this case:

<xsl:apply-templates select="following::p[count(.|following::p) > count(current()/following::h3/following::p)]"/>

Note 1: This example shows that the semantics of XHTML are in some cases rather vague.

Note 2: Why is it necessary to apply templates explicitly to the "p" elements? If it is because they need some data provided by the "h3" element that precedes them, the same can be achieved with the expression "preceding::h3" within a template for the "p" elements.
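The selection rule itself, "the p elements whose nearest preceding h3 is the current one", can be sketched outside XSLT. The Python below uses the standard library only and assumes a flat list of siblings under a root element. The sample input is shortened, and the helper names nearest_preceding_h3 and paragraphs_for are made up for illustration. It uses an identity check in place of XPath set arithmetic, so it shows the logic rather than translating the expression.

```python
import xml.etree.ElementTree as ET

SOURCE = """<root>
<h3 class="title">Title</h3>
<p><span class="content">Some</span><span class="content"> text</span></p>
<p><span class="content">Another</span><span class="content"> paragraph</span></p>
<h3 class="title"><span class="content">Title</span><span class="content"> 2</span></h3>
<p><span class="content">Here</span><span class="content"> is</span></p>
</root>"""

def nearest_preceding_h3(siblings, index):
    """Walk backwards from position index to the closest earlier h3."""
    for node in reversed(siblings[:index]):
        if node.tag == "h3":
            return node
    return None

def paragraphs_for(h3, siblings):
    """Keep each p whose nearest preceding h3 is this exact h3 node."""
    return [
        node
        for index, node in enumerate(siblings)
        if node.tag == "p" and nearest_preceding_h3(siblings, index) is h3
    ]

siblings = list(ET.fromstring(SOURCE))
sections = [
    (
        "".join(h3.itertext()),
        ["".join(p.itertext()) for p in paragraphs_for(h3, siblings)],
    )
    for h3 in siblings
    if h3.tag == "h3"
]

for title, paragraphs in sections:
    print(title, paragraphs)
```

Comparing nodes by identity rather than by their text is the point of the design: two headings with the same text still produce separate groups.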

Edit: Now, with your output sample...

XML input:

<root>
<h3 class="title">Title</h3> 
<p> 
    <span class="content">Some</span> 
    <span class="content"> text</span> 
    <span class="content"> for</span> 
    <span class="content"> this</span> 
    <span class="content"> section</span> 
</p> 
<p> 
    <span class="content">Another</span> 
    <span class="content"> paragraph</span> 
</p> 
<h3 class="title"> 
    <span class="content">Title</span> 
    <span class="content"> 2</span> 
    <span class="content"> is</span> 
    <span class="content"> complex</span> 
</h3> 
<p> 
    <span class="content">Here</span> 
    <span class="content"> is</span> 
    <span class="content"> some</span> 
    <span class="content"> text</span> 
</p> 
</root>

XSLT stylesheet:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

<xsl:template match="/">
<root>
<xsl:apply-templates select="/root/h3" />
</root>
</xsl:template>

<xsl:template match="h3">
<section>
<title>
<xsl:apply-templates select="node()" />
</title>
<xsl:apply-templates select="following-sibling::p[count(preceding-sibling::h3[1]|current())=1]"/>
</section>
</xsl:template>

<xsl:template match="p">
<xsl:copy>
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>

<xsl:template match="span">
<content>
<xsl:value-of select="." />
</content>
</xsl:template>

</xsl:stylesheet>

XML output:

<root>
<section>
<title>Title</title>
<p>
<content>Some</content>
<content> text</content>
<content> for</content>
<content> this</content>
<content> section</content>
</p>
<p>
<content>Another</content>
<content> paragraph</content>
</p>
</section>
<section>
<title>
<content>Title</content>
<content> 2</content>
<content> is</content>
<content> complex</content>
</title>
<p>
<content>Here</content>
<content> is</content>
<content> some</content>
<content> text</content>
</p>
</section>
</root>
Alejandro
I am not sure exactly where to use the string() function. I tried: [string()=current()/string()] but I get an error saying "Unknown nodetype: string". Re Note 2, I think I understand the question, and I am not sure I know the answer. There may be a better implementation. My solution came out of the Stack Overflow post I link to at the top that describes how to select all following siblings until another sibling. I want to apply the template to all "p" tags between one "h3" and another, and that turned out to be quite a challenge! But the XSL above seems to do so.
Brian
Brian: you wrote that you would like an XPath query on "h3[@class='title']" that would give you "this is some text". So, "string(h3[@class='title'])" gives you what you want. I don't know what your goal is. You wrote: "I want to apply the template to all "p" tags between one "h3" and another." So, if your current node in the transformation is some h3, then the xsl:apply-templates line I gave you would do.
Alejandro
I see now where to put the string() but it doesn't seem to help me. I would need it nested further and it doesn't work there. I will edit the question with the desired output and maybe that will clear things up. The answer I accepted has a solution that works. Thanks!
Brian
@Brian: As I wrote, string() is often applied implicitly. Your XPath expression "following-sibling::p[generate-id(preceding-sibling::h3[1][@class='title'][text()=current()/text()])=generate-id(current())]" can be simplified to "following-sibling::p[generate-id(preceding-sibling::h3[1][@class='title'][.=current()])=generate-id(current())]". Better: "following-sibling::p[count(preceding-sibling::h3[1]|current())=1]"
Alejandro
Thanks Alejandro! This is much cleaner XSL than I had before. I should have posted the output in the first place.
Brian