As I understand XPath, it is a way to navigate through elements in an XML document.
The direction is XPath -> element.
How do you go the other way around? That is, compute the XPath from a known element value?

For example, how would you find the XPath of the "faq" link in the Stack Overflow header?
The language is not that important; I am more interested in the algorithm and/or libraries/techniques that would help me compute the XPath.

+1  A: 

Since XPath can select the nth child element with a given name (e.g. /parentelement/child_element[2]), if you can figure out where in the tree the element is, you should be able to generate an XPath back to it.
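
A minimal Python sketch of that idea, assuming the standard-library xml.etree.ElementTree parser and a small made-up document. The helper name positional_xpath is hypothetical, not a library function. It walks from the element up to the root and records each element's 1-based position among its same-named siblings:

```python
import xml.etree.ElementTree as ET

def positional_xpath(root, target):
    """Return an absolute path such as /a/c[1]/d[2] for `target` under `root`.

    Hypothetical helper for illustration; not part of ElementTree.
    """
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]  # raises KeyError if target is not under root
        # XPath positions are 1-based and count only siblings with the same tag.
        same_tag = [sib for sib in parent if sib.tag == node.tag]
        index = next(i for i, sib in enumerate(same_tag, start=1) if sib is node)
        steps.append(f"{node.tag}[{index}]")
        node = parent
    steps.append(root.tag)
    return "/" + "/".join(reversed(steps))

# Made-up sample document: locate an element by its known text, then compute its path.
root = ET.fromstring("<a><b/><c><d>x</d><d>faq</d></c></a>")
target = next(el for el in root.iter() if el.text == "faq")
print(positional_xpath(root, target))  # /a/c[1]/d[2]
```

Every step below the root carries an explicit index here, which keeps the path unambiguous at the cost of some noise.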

Thanatos
A: 

You didn't specify a language, which makes this question difficult to answer. The Python lxml module can do it:

>>> from lxml import etree
>>> a  = etree.Element("a")
>>> b  = etree.SubElement(a, "b")
>>> c  = etree.SubElement(a, "c")
>>> d1 = etree.SubElement(c, "d")
>>> d2 = etree.SubElement(c, "d")

>>> tree = etree.ElementTree(c)
>>> print(tree.getpath(d2))
/c/d[2]
>>> tree.xpath(tree.getpath(d2)) == [d2]
True

Even if you aren't using Python, you may find what you need in the module's source code.

SpliFF
+1  A: 

Here's a simple JS function to do it. It uses only previousSibling, nodeType, and parentNode, so it should be portable to other languages. However, the result is not readable (for humans), and it will not be particularly robust in the face of page changes.

In my experience, XPath is more useful if written by hand. However, you can certainly make an algorithm that will generate prettier (if probably slower) results.

function getXPath(node)
{
    if (node == document)
        return "/";
    var xpath = "";
    // Walk up to the document node, prepending one step per level.
    while (node != null && node.nodeType != Node.DOCUMENT_NODE)
    {
        // XPath positions are 1-based: count this node plus its preceding siblings.
        var pos = 1, prev = node;
        while ((prev = prev.previousSibling) != null)
        {
            pos++;
        }
        xpath = "/node()[" + pos + "]" + xpath;
        node = node.parentNode;
    }
    return xpath;
}
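
As a rough illustration of what a "prettier" result could look like, here is a Python sketch using the standard-library xml.etree.ElementTree. It makes two assumptions that may not hold for a real page: id attribute values are unique, and the markup parses as well-formed XML. The helper name readable_xpath and the sample markup (including the "hlinks" id) are made up for this example. It anchors the path on the nearest ancestor with an id and adds an index only where same-named siblings exist:

```python
import xml.etree.ElementTree as ET

def readable_xpath(root, target):
    """Build a shorter path: anchor on the nearest id, index only when needed.

    Hypothetical helper for illustration; assumes id values are unique.
    """
    # ElementTree elements have no parent pointer, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    prefix = "/"
    node = target
    while node is not None:
        if node.get("id"):
            # An id lets the path start here instead of at the document root.
            steps.append(f'*[@id="{node.get("id")}"]')
            prefix = "//"
            break
        parent = parent_of.get(node)
        step = node.tag
        if parent is not None:
            same_tag = [sib for sib in parent if sib.tag == node.tag]
            if len(same_tag) > 1:
                # Index only when the tag name alone would be ambiguous.
                index = next(i for i, sib in enumerate(same_tag, start=1) if sib is node)
                step += f"[{index}]"
        steps.append(step)
        node = parent
    return prefix + "/".join(reversed(steps))

# Made-up page fragment; the "hlinks" id is an assumption for the example.
page = ET.fromstring(
    '<html><body><div id="hlinks"><a>about</a><a>faq</a></div></body></html>'
)
faq = next(el for el in page.iter("a") if el.text == "faq")
print(readable_xpath(page, faq))  # //*[@id="hlinks"]/a[2]
```

A path anchored on an id tends to survive layout changes elsewhere on the page, but it still breaks if the links inside that container are reordered.
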
Matthew Flaschen