I am using XQuery to extract content from HTML pages. The HTML body has this kind of structure:

 <td>
      <a href ="hw1">xyz </a>
          Hello world 1 
        <a href="hw2">Helloworld 2</a>
          Helloworld 3         
 </td>

My XQuery expression for extracting the text is as follows:

  //a[starts-with(@href,'hw1')]/following-sibling::text()

This expression gives me:

Helloworld 1 Helloworld 2 Helloworld 3

I would like to have it in this fashion: Helloworld 1 Helloworld 2 Helloworld 3 or Helloworld 1 Helloworld 3

How do I specify how to parse the text enclosed by tags?

A: 

I'm not entirely clear on what you're looking for, but

let $content := 
 <td>
      <a href ="hw1">xyz </a>
          Hello world 1 
        <a href="hw2">Helloworld 2</a>
          Helloworld 3         
 </td>

return $content/text()

gives you the text nodes directly under the <td>. I don't see a difference between what you're getting and what you want... perhaps your post lost some formatting?
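If the goal is to control whether the text inside the second link is included, here is a minimal sketch, assuming the markup is exactly as posted. The variable names ($start, $siblings-only, $with-nested) and the <result> wrapper are hypothetical, chosen only for illustration. It contrasts following-sibling::text(), which selects only text nodes that are siblings of the hw1 link, with following-sibling::node()/descendant-or-self::text(), which also picks up text nested inside later elements such as the hw2 link. normalize-space() trims the surrounding whitespace from each piece.

```xquery
xquery version "1.0";

(: Sample markup copied from the question. :)
let $content :=
  <td>
    <a href="hw1">xyz </a>
      Hello world 1
    <a href="hw2">Helloworld 2</a>
      Helloworld 3
  </td>

(: Hypothetical name: the link the question's expression starts from. :)
let $start := $content/a[starts-with(@href, 'hw1')]

(: Only text nodes that are siblings of the link;
   the text inside the hw2 link is skipped. :)
let $siblings-only :=
  for $t in $start/following-sibling::text()
  return normalize-space($t)

(: Every text node after the link, including text
   nested inside later sibling elements. :)
let $with-nested :=
  for $t in $start/following-sibling::node()/descendant-or-self::text()
  return normalize-space($t)

return
  <result>
    <siblings-only>{ string-join($siblings-only, ' | ') }</siblings-only>
    <with-nested>{ string-join($with-nested, ' | ') }</with-nested>
  </result>
```

With the sample above, the first sequence should contain "Hello world 1" and "Helloworld 3", and the second should add "Helloworld 2" between them. Joining with a visible separator is only a design choice to make the individual text nodes easy to tell apart in the output.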

Dave Cassel