Do you know of any non-strict XPath implementation for Java? (I want it not to check the DTD or schema.) It would also be nice if it didn't care whether the XML is correct.

A: 

OK, first of all, "correct XML" can be interpreted in a couple of ways. If you mean the XML may not be well-formed (missing angle brackets, overlapping elements, etc.), no version of XPath is likely to do anything useful with it. You'd be better off with some sort of regular expressions. If your XML isn't well-formed, I hope you have some idea of how it will be broken; otherwise you have little hope of getting anywhere with it.
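As a rough illustration of the regular-expression route, here is a minimal sketch that pulls the text out of every occurrence of one tag without parsing the document at all. The class name TagTextScraper, the method textOf and the sample markup are all made up for this example, and the pattern assumes the tag you want is itself intact and not nested inside another tag of the same name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: scrapes tag contents from markup that may not be well-formed.
final class TagTextScraper {

    /** Returns the text between each <tagName ...> and </tagName> pair found in markup. */
    static List<String> textOf(String tagName, String markup) {
        String tag = Pattern.quote(tagName);
        // Opening tag with optional attributes, lazily captured body, closing tag.
        Pattern pattern = Pattern.compile(
                "<" + tag + "(?:\\s[^>]*)?>(.*?)</" + tag + "\\s*>",
                Pattern.DOTALL);
        List<String> results = new ArrayList<>();
        Matcher matcher = pattern.matcher(markup);
        while (matcher.find()) {
            results.add(matcher.group(1));
        }
        return results;
    }

    public static void main(String[] args) {
        // <baz> is never closed and the final '>' is missing, so an XML parser would reject this.
        String broken = "<foo><bar>one</bar><baz><bar>two</bar></foo";
        System.out.println(textOf("bar", broken)); // prints [one, two]
    }
}
```

This only works because you know in advance which part of the document can be trusted; it is not a general substitute for a parser.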

If you actually mean "invalid", which simply means it doesn't validate against a schema or DTD, then XPath doesn't care: it never validates anything. What usually gets in the way is namespaces, and you can sidestep those with predicates that compare against the local-name() function. For instance, if you want to find the "/foo/bar" element regardless of which namespace it is in, then your XPath would look like this:

/*[local-name()='foo']/*[local-name()='bar']
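
For what it's worth, here is a minimal sketch of running that expression from Java with the JDK's built-in javax.xml.xpath API. The class name, the sample document, its namespace URI and the DTD URL are all invented for the example. The empty entity resolver is one common way to stop the parser from fetching an external DTD; treat it as an assumption about your setup rather than the only way to do it.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class: evaluates XPath without validating the document.
final class LocalNameXPathDemo {

    /** Parses xml without validation and returns the string value of the first match. */
    static String evaluate(String xml, String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);    // no DTD validation (also the default)
            factory.setNamespaceAware(true); // so local-name() is well defined
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Hand the parser an empty DTD instead of letting it fetch the real one.
            builder.setEntityResolver(
                    (publicId, systemId) -> new InputSource(new StringReader("")));
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            return xpath.evaluate(expression, doc);
        } catch (ParserConfigurationException | SAXException
                | IOException | XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE foo SYSTEM \"http://example.invalid/missing.dtd\">"
                + "<foo xmlns=\"urn:example:made-up\"><bar>hello</bar></foo>";
        // Matches in spite of the default namespace on <foo>.
        System.out.println(evaluate(xml, "/*[local-name()='foo']/*[local-name()='bar']"));
        // The plain path finds nothing, because unprefixed names mean "no namespace".
        System.out.println(evaluate(xml, "/foo/bar").isEmpty());
    }
}
```

Note that this still requires the input to be well-formed; the parser will throw on anything that isn't.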
David M. Karr
+1  A: 

You don't need schema-valid XML to use XPath. For non-well-formed XML, I think you have two options:

  • Generate a well-formed DOM tree from the file. I suggest running the file through JTidy or TagSoup. Once you have that, you can use XPath as normal.
  • Generate some other tree-shaped model, then use a customized Navigator for Jaxen's XPath. (Jaxen lets you use XPath on any model you want.)
jamesh