Hi there, I'm trying to parse quite a large XML-sheet with PHP, but I'm fairly new to it. The XML-sheet contains a couple of thousand records.

Here is an example of the structure used within the sheet:

<familyList>
    <family>
        <familyID>1234</familyID>
        <familyDescription>The Jonathans</familyDescription>
        <childrenList>
            <child>Suzan</child>
            <child>Fred</child>
            <child>Harry</child>
        </childrenList>
    </family>
    <family>
        <familyID>1235</familyID>
        <familyDescription>The Gregories</familyDescription>
        <childrenList>
            <child>Anthony</child>
            <child>Lindsay</child>
        </childrenList>
    </family>
</familyList>

As I'm fairly new to XML-parsing using PHP, I wonder what would be the best way to parse this nested XML-sheet into an array. I need to convert the XML to an array so I can insert the data into a MySQL database afterwards.

Could you please give me a push in the right direction, as I haven't been successful in puzzling out a solution so far?

Thanks!

+3  A: 

When you are parsing a large XML file, you should use an XML pull parser (XPP). PHP has an implementation of a pull parser called XMLReader. Also, storing the XML as an array will consume a lot of memory for a large file.

I recommend using XMLReader and inserting the data into your database as you parse it, without waiting for the end of the file. It won't use a huge amount of memory, and it should be faster.
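A minimal sketch of that approach, assuming the structure from the question. The names streamFamilies() and $onFamily are made up for this example, and the callback only collects records here; in real use it is where the database INSERT would go. Each family is pulled out with readOuterXml() and handed to SimpleXML, so only one record is in memory at a time.

```php
<?php
// Sketch only: streamFamilies() and $onFamily are hypothetical names.
// Reads <family> elements one at a time and passes each to a callback.
function streamFamilies(XMLReader $reader, callable $onFamily): int
{
    $count = 0;
    $found = false;

    // Advance to the first <family> start tag
    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'family') {
            $found = true;
            break;
        }
    }

    while ($found) {
        // Only the current <family> subtree is turned into an object
        $family = simplexml_load_string($reader->readOuterXml());

        $record = array(
            'familyID'          => (string) $family->familyID,
            'familyDescription' => (string) $family->familyDescription,
            'children'          => array(),
        );
        foreach ($family->childrenList->child as $child) {
            $record['children'][] = (string) $child;
        }

        $onFamily($record);
        $count++;

        // Skip the rest of this subtree and move to the next <family>
        $found = $reader->next('family');
    }

    return $count;
}

// Demo with the sample data written to a temporary file
$xml = '<familyList>'
     . '<family><familyID>1234</familyID><familyDescription>The Jonathans</familyDescription>'
     . '<childrenList><child>Suzan</child><child>Fred</child><child>Harry</child></childrenList></family>'
     . '<family><familyID>1235</familyID><familyDescription>The Gregories</familyDescription>'
     . '<childrenList><child>Anthony</child><child>Lindsay</child></childrenList></family>'
     . '</familyList>';
$path = tempnam(sys_get_temp_dir(), 'fam');
file_put_contents($path, $xml);

$reader = new XMLReader();
$reader->open($path);

$records = array();
$total = streamFamilies($reader, function (array $record) use (&$records) {
    // Placeholder for the database insert
    $records[] = $record;
});

$reader->close();
unlink($path);
```

For a real import you would pass the path of your own file to open() and run a prepared INSERT statement inside the callback instead of filling $records.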

This tutorial can be a good start to understand how to use XMLReader with PHP.

As pointed out in the comments, XML Parser can be another solution for parsing large XML files.
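For comparison, here is a rough sketch of the event-based XML Parser route, again assuming the structure from the question. parseFamilies() and $onFamily are hypothetical names, and the demo reads from an in-memory stream rather than a real file. The file is fed to the parser in chunks, and handlers fire for each start tag, end tag and piece of text.

```php
<?php
// Sketch only: parseFamilies() and $onFamily are hypothetical names.
// Feeds a stream to the event-based XML Parser in chunks.
function parseFamilies($stream, callable $onFamily): void
{
    $current = null; // family currently being built
    $text = '';      // character data of the element being read

    $parser = xml_parser_create();
    // Keep element names as written instead of upper-casing them
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

    xml_set_element_handler(
        $parser,
        function ($p, $name, $attrs) use (&$current, &$text) {
            $text = '';
            if ($name === 'family') {
                $current = array('familyID' => null, 'familyDescription' => null, 'children' => array());
            }
        },
        function ($p, $name) use (&$current, &$text, $onFamily) {
            if ($current !== null) {
                if ($name === 'familyID' || $name === 'familyDescription') {
                    $current[$name] = trim($text);
                } elseif ($name === 'child') {
                    $current['children'][] = trim($text);
                } elseif ($name === 'family') {
                    // A complete record: hand it over and forget it
                    $onFamily($current);
                    $current = null;
                }
            }
            $text = '';
        }
    );

    // Text can arrive in several pieces, so append rather than assign
    xml_set_character_data_handler($parser, function ($p, $data) use (&$text) {
        $text .= $data;
    });

    while (!feof($stream)) {
        $chunk = (string) fread($stream, 8192);
        if (!xml_parse($parser, $chunk, feof($stream))) {
            throw new RuntimeException(xml_error_string(xml_get_error_code($parser)));
        }
    }
}

// Demo with the sample data in an in-memory stream
$stream = fopen('php://memory', 'w+');
fwrite($stream, '<familyList>'
    . '<family><familyID>1234</familyID><familyDescription>The Jonathans</familyDescription>'
    . '<childrenList><child>Suzan</child><child>Fred</child><child>Harry</child></childrenList></family>'
    . '<family><familyID>1235</familyID><familyDescription>The Gregories</familyDescription>'
    . '<childrenList><child>Anthony</child><child>Lindsay</child></childrenList></family>'
    . '</familyList>');
rewind($stream);

$families = array();
parseFamilies($stream, function (array $family) use (&$families) {
    // Placeholder for the database insert
    $families[] = $family;
});
fclose($stream);
```

This takes more bookkeeping than XMLReader because you track the current element yourself, but memory use stays flat regardless of file size.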

HoLyVieR
I saw your comment below the now deleted answer of ircmaxell. To my knowledge [Xml Parser](http://us2.php.net/manual/en/book.xml.php) is an event based parser and thus well suited for large files.
Gordon
@Gordon, sorry, I confused it with SimpleXML and DOMDocument, which both load the entire document. I'll add XML Parser as another possible solution.
HoLyVieR
A: 

DOMDocument has lots of excellent methods for accessing, updating and outputting formatted XML. With regards to converting to an array, I'd suggest going for objects inside an array, which is something that PHP is perfectly fine with, and I find the syntax much clearer than arrays for keeping track of this kind of hierarchy.

<?php

// Load the XML file. Note that DOMDocument reads the whole document into memory.
$families = new DOMDocument();
$families->load("/xml/families.xml"); // your xml file

$families_items = $families->getElementsByTagName("family");

$my_cool_array_of_families = array(); // will hold one object per family

foreach ($families_items as $family_item) {

    $toinsert = new stdClass(); // fresh object for this family

    // getElementsByTagName() returns a DOMNodeList, so take the first match
    $toinsert->family_id = $family_item->getElementsByTagName('familyID')->item(0)->nodeValue;
    $toinsert->familyDescription = $family_item->getElementsByTagName('familyDescription')->item(0)->nodeValue;

    // children: reset per family so names don't carry over to the next one
    $child_toinsert = array();
    foreach ($family_item->getElementsByTagName('child') as $child) {
        $child_object = new stdClass();
        $child_object->name = $child->nodeValue;
        $child_toinsert[] = $child_object;
    }
    // etc for your details

    $toinsert->children = $child_toinsert;

    // build array of objects
    $my_cool_array_of_families[] = $toinsert;
}

var_dump($my_cool_array_of_families);

Something like this, double check the syntax, but it's on the way ;)

danp
DOM is great but DOM will load the entire XML file into memory, which might not be feasible in the OP's case, because he has very large files.
Gordon