ansaurus

Question

Selecting (siblings) between two tags using XPath (in .NET)

Answer 1

A:

How about:

p/*[not(local-name()='br')]

And then index that expression for whatever term you want.

EDIT:

For your indexing issue:

p/*[not(local-name()='br') and position() < x and position() > y]

LorenVS 2009-08-19 19:35:10

The first would be fine if there weren't multiple elements between separators. The second is more promising, but it requires me to know the absolute indexes of the separators when I build the XPath expression, which I'd have to find out in a separate scanning step of some sort.

John Bartholomew 2009-08-19 20:08:22

Answer 2

A:

Try using the position() or maybe the count() functions. Here's a guess that might help you get the right syntax.

p/*[position() > position(/p/br[1]) and position() < position(/p/br[2])]

EDIT: Please read the comments before voting or commenting.

John Fisher 2009-08-19 19:55:07

This is promising, but the position() function (at least in the .NET implementation) does not take any arguments, which ruins it a bit. Since position() finds the position of the context node, I wondered if there might be a syntax that will let me call position() in a different context. I tried `../br[1]/position()` but this is also invalid – any other ideas along this line?

John Bartholomew 2009-08-19 20:11:15

Personally, I would use C# code to do this work. Getting the position of the two bounding elements in separate XPath calls would allow you to create a third XPath statement to retrieve the nodes you'd like to use.

John Fisher 2009-08-19 21:11:43
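
The approach in the comment above could be sketched roughly as follows. This is only an illustration, not code from the thread: the class name BetweenSeparators, the helper BuildXPath, and the sample markup are made up for the example, and it assumes both bounding <br /> elements are direct children of the p element.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

class BetweenSeparators
{
    // Hypothetical helper: builds an XPath (relative to p) that selects the
    // child nodes lying strictly between the firstBr-th and secondBr-th <br />.
    public static string BuildXPath(XmlNode p, int firstBr, int secondBr)
    {
        XPathNavigator nav = p.CreateNavigator();

        // Calls 1 and 2: the 1-based position of each bounding <br /> among
        // p's child nodes. Number-valued expressions come back as a double.
        int start = Convert.ToInt32(nav.Evaluate(string.Format(
            "count(br[{0}]/preceding-sibling::node()) + 1", firstBr)));
        int end = Convert.ToInt32(nav.Evaluate(string.Format(
            "count(br[{0}]/preceding-sibling::node()) + 1", secondBr)));

        // The third expression, built from the two positions found above.
        return string.Format(
            "node()[position() > {0} and position() < {1}]", start, end);
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(
            "<p><b>Component A</b><br />" +
            "Component B <i>includes</i> <strong>multiple elements</strong><br />" +
            "Component C</p>");

        XmlNode p = doc.DocumentElement;
        string xpath = BuildXPath(p, 1, 2);
        Console.WriteLine(xpath);

        foreach (XmlNode node in p.SelectNodes(xpath))
        {
            Console.WriteLine("-->{0}<---", node.OuterXml);
        }
    }
}
```

Because the final expression uses node() rather than *, text nodes between the separators are selected along with elements. If the requested <br /> does not exist, the count evaluates to 0 and the resulting expression selects nothing.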

This is an invalid XPath expression that doesn't even compile. The position() function takes no arguments.

Dimitre Novatchev 2009-08-21 13:28:52

Thanks Dimitre, but I did say it was a "guess" and that you should "read the comments" which already indicated that.

John Fisher 2009-08-21 16:01:13

Answer 3

+1 A:

If in your situation you always have exactly three 'pieces', separated by brs, you can use this XPath to get the middle 'piece':

//node()[preceding::br and following::br]

which uses the preceding and following axes to return all nodes between two brs, anywhere at all.

Edit: this is my test app (please excuse the XmlDocument, I am still working with .NET 2.0...)

using System;
using System.Xml;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            XmlDocument doc = new XmlDocument();
            doc.LoadXml(@"
<x>
 <h3>Section title</h3>
 <p>
  <b>Component A</b><br />
  Component B <i>includes</i> <strong>multiple elements</strong><br />
  Component C
 </p>
</x>
            ");

            XmlNodeList nodes = doc.SelectNodes(
                "//node()[preceding::br and following::br]");

            Dump(nodes);

            Console.ReadLine();
        }

        private static void Dump(XmlNodeList nodes)
        {
            foreach (XmlNode node in nodes)
            {
                Console.WriteLine("-->{0}<---", node.OuterXml);
            }
        }
    }
}

And this is the output:

-->
      Component B <---
--><i>includes</i><---
-->includes<---
--><strong>multiple elements</strong><---
-->multiple elements<---

As you can see, you get an XmlNodeList with all the stuff between the brs.

The way I think about it is: This XPath returns any node anywhere, so long as for that node, the preceding axis contains a br, and the following axis contains a br.

AakashM 2009-08-19 20:14:16

A quick test suggests that this doesn't work (unless I've made some other stupid mistake). My understanding was that when a node-set expression appears within a filter clause, it acts as a 'true' value if it contains any items, whereas for this to work it would instead have to perform a set-membership test (ie, act as 'true' only if the node being filtered is within the set returned by the node-set expression). In this case, there are always some nodes before a br, and some nodes after, so it just selects everything. Please correct my understanding if I've got this wrong!

John Bartholomew 2009-08-19 20:22:21

Ah, I see. I was thinking about it the wrong way round. Thanks for the clarification.

John Bartholomew 2009-08-19 21:17:16

Answer 4

A:

This can easily be done with XPath 2.0 or with XPath 1.0 hosted by XSLT.

With XPath 1.0 hosted by .NET this can be achieved in several steps:

1. Make the appropriate "p" node the current node.

2. Find the number of all <br /> children of the current "p" node:

count(br)

3. If N is the count determined in step 2, then for $k in 0 to N do:

3.1 Find all nodes that are preceded by $k <br /> elements:

node()[not(self::br) and count(preceding::br) = $k]

3.2 For every such node found, get its string value

3.3 Concatenate all string values obtained in step 3.2. The result of this concatenation is all the text contained in the given piece of the paragraph, that is, the piece preceded by exactly $k <br /> elements.

Note: In order to substitute what should stand for $k in step 3.1 it is necessary to dynamically construct this expression.

Dimitre Novatchev 2009-08-21 13:49:12
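
The steps in this answer might look something like the following in C#. Treat it as a sketch under some assumptions rather than the answer's own code: the class SplitOnBr and method Pieces are hypothetical names, and the expression uses preceding-sibling::br instead of preceding::br so that <br /> elements elsewhere in the document do not affect the count.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Xml;
using System.Xml.XPath;

class SplitOnBr
{
    // Hypothetical helper: returns the text of each piece of p, where the
    // pieces are separated by p's <br /> children.
    public static List<string> Pieces(XmlNode p)
    {
        // Step 1: p is the context node for every expression below.
        XPathNavigator nav = p.CreateNavigator();

        // Step 2: count the <br /> children (Evaluate returns a double here).
        int n = Convert.ToInt32(nav.Evaluate("count(br)"));

        List<string> pieces = new List<string>();
        for (int k = 0; k <= n; k++)
        {
            // Step 3.1: the expression is constructed dynamically, with k
            // substituted where the answer writes $k.
            string xpath = string.Format(
                "node()[not(self::br) and count(preceding-sibling::br) = {0}]", k);

            // Steps 3.2 and 3.3: concatenate the text of each node found.
            StringBuilder text = new StringBuilder();
            foreach (XmlNode node in p.SelectNodes(xpath))
            {
                text.Append(node.InnerText);
            }
            pieces.Add(text.ToString());
        }
        return pieces;
    }

    static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // Keep whitespace-only text between inline elements.
        doc.PreserveWhitespace = true;
        doc.LoadXml(
            "<p><b>Component A</b><br />" +
            "Component B <i>includes</i> <strong>multiple elements</strong><br />" +
            "Component C</p>");

        foreach (string piece in Pieces(doc.DocumentElement))
        {
            Console.WriteLine("[{0}]", piece);
        }
    }
}
```

Setting PreserveWhitespace matters here: by default XmlDocument drops whitespace-only text between elements, which would run the text of adjacent inline elements together.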
