Consider this hypothetical XML:

<myApi xmlns="urn:something" xmlns:bla="urn:hello">
  <argument1>foo</argument1>
  <argument2>
     <p xmlns="http://www.w3.org/1999/xhtml">Some paragraph of text. <img src="http://www.example.org/hello.png" bla:test="oi" /></p>
  </argument2>
</myApi>

What would be the best way in PHP to parse out the 2nd argument? The full XML structure must be stored in the database, but it has to be taken into account that the fragment could reference XML namespaces that were already declared higher up in the document.

Is there a good way to take out a chunk of XML using any of the readily available PHP parsing libraries, and store all the semantic information?

+1  A: 

Have you looked at SimpleXML? It does parsing without validating, so you can throw any well-formed XML at it and sort out the valid and invalid tags yourself. You can also pull out XML fragments at any level. The only thing it struggles a little with is namespaces.

staticsan
Namespaces are the big thing for me. SimpleXML will just give me a 'substring' of a fragment, but that fragment isn't necessarily valid on its own (its namespace declarations may be missing).
Evert
AFAIK, its limitation is creating tags as part of namespaces. There are ways to do it, but I hear it's a trifle awkward. I haven't used SimpleXML to work with namespaced tags, note, just plain, unvalidated XML and it works very well with that.
staticsan
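
One possible way to keep the namespace context discussed in the comments above is to use PHP's DOM extension instead of SimpleXML, and serialize the wanted node with exclusive canonicalization (DOMNode::C14N with its first argument set to true). That form re-declares, inside the fragment itself, the namespaces the fragment visibly uses, so the stored string can be parsed on its own later. This is only a sketch, not a definitive solution: the function name extractFragments and the XPath prefix "api" are made up for illustration, and it assumes that only the element children of argument2 need to be kept (loose text directly inside argument2 is skipped).

```php
<?php
// Hypothetical helper: returns each element child of <argument2> as a
// standalone XML string that declares the namespaces it relies on.
function extractFragments(string $xml): array
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    // The "api" prefix is an arbitrary local alias for the document's
    // default namespace; XPath needs a prefix to match namespaced names.
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('api', 'urn:something');

    $fragments = [];
    foreach ($xpath->query('/api:myApi/api:argument2/*') as $node) {
        // Exclusive canonicalization writes out the namespace declarations
        // that the subtree visibly uses, including ones inherited from
        // ancestors such as xmlns:bla on the root element.
        $fragments[] = $node->C14N(true);
    }

    return $fragments;
}

$xml = <<<'XML'
<myApi xmlns="urn:something" xmlns:bla="urn:hello">
  <argument1>foo</argument1>
  <argument2>
     <p xmlns="http://www.w3.org/1999/xhtml">Some paragraph of text. <img src="http://www.example.org/hello.png" bla:test="oi" /></p>
  </argument2>
</myApi>
XML;

$fragments = extractFragments($xml);

// Print the first fragment; this is the string that would be stored.
echo $fragments[0], "\n";
```

Note that canonical XML is a normalized form: empty elements are written with an explicit end tag, and a namespace that is declared in the source but never used by the fragment's elements or attributes is dropped. If a fragment refers to a prefix only inside text or attribute values (QNames in content), that prefix would have to be passed explicitly through the fourth argument of C14N.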