Hi,

I'm trying to wrap my head around PHP and XML.

I'm trying to do something:

There is an XML document that I'm retrieving via cURL (I've also tried PHP's XML libraries directly, such as XMLReader::open($url)). The method of retrieval doesn't matter; I have this part working.

The problem is parsing the XML on the retrieved page.

Here is an example of the XML:

http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=9780471615156&maximumRecords=1&recordPacking=xml&recordSchema=marcxml

What I need to get from that page is the call number;

<datafield tag="060" ind1=" " ind2=" ">
  <subfield code="a">WM 173.6 R823m</subfield>
</datafield>

author;

<datafield tag="100" ind1="1" ind2=" ">
  <subfield code="a">Ross, Colin A.</subfield>
</datafield>

and title information;

<datafield tag="245" ind1="1" ind2="0">
  <subfield code="a">Multiple personality disorder :</subfield>
  <subfield code="b">diagnosis, clinical features, and treatment /</subfield>
  <subfield code="c">Colin A. Ross.</subfield>
</datafield>

Seems simple enough. However, for the life of me I cannot get any of PHP's built-in XML functions to work (because I'm doing it wrong).

Here is an example I've tried:

//xml file retrieved via curl and saved to folder
$file="9780471615156.xml";

$xml = simplexml_load_file($file);

echo $xml->getName();//returns searchRetrieveResponse

foreach($xml->searchRetrieveResponse[0]->attributes() as $a => $b){
  echo $a,'="',$b,"\"</br>";//nothing
 }

foreach ($xml->searchRetrieveResponse[0]->children() as $child){
  echo "Child node: " . $child . "<br />";//nothing
}

It returns the name of the root node, but I can't get it to go any deeper.

Any tips would be greatly appreciated.

NB: I'm running PHP 5+

+1  A: 

Hi, as far as I could tell from my own attempts, SimpleXML can't read this XML. Try the example below: it dumps an array that you can easily loop through to find what you need by comparing the keys and values you're looking for.

// load XML into string here
// $string = ????;
$xml_parser = xml_parser_create();
xml_parse_into_struct($xml_parser, $string, $object, $index);

echo '<pre>';
print_r($object);
// print_r($index);
echo '</pre>';
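
To make the "loop through and compare" step concrete, here is a minimal sketch. It assumes PHP 7+ (for the ?? operator) and uses an inline sample string built from the datafields in the question rather than the live response. The $wanted map and the $found array are names made up for this illustration. Case folding is switched off so tag and attribute names keep their original lowercase spelling.

```php
<?php
// Inline sample built from the datafields quoted in the question (not the full response).
$string = <<<'XML'
<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="060" ind1=" " ind2=" ">
    <subfield code="a">WM 173.6 R823m</subfield>
  </datafield>
  <datafield tag="100" ind1="1" ind2=" ">
    <subfield code="a">Ross, Colin A.</subfield>
  </datafield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">Multiple personality disorder :</subfield>
    <subfield code="b">diagnosis, clinical features, and treatment /</subfield>
    <subfield code="c">Colin A. Ross.</subfield>
  </datafield>
</record>
XML;

$parser = xml_parser_create();
// Keep names as written (the default would uppercase them) and skip whitespace-only text.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);
xml_parser_set_option($parser, XML_OPTION_SKIP_WHITE, 1);
xml_parse_into_struct($parser, $string, $values, $index);

// Hypothetical mapping from MARC tag to a readable label.
$wanted = ['060' => 'call number', '100' => 'author', '245' => 'title'];

$found   = [];
$current = null; // label of the datafield we are currently inside, if it is a wanted one
foreach ($values as $node) {
    if ($node['tag'] === 'datafield') {
        if ($node['type'] === 'open') {
            $tag     = $node['attributes']['tag'] ?? '';
            $current = $wanted[$tag] ?? null;
        } elseif ($node['type'] === 'close') {
            $current = null;
        }
    } elseif ($node['tag'] === 'subfield' && $node['type'] === 'complete' && $current !== null) {
        // Store the subfield text under its code (a, b, c, ...).
        $found[$current][$node['attributes']['code']] = $node['value'];
    }
}

print_r($found);
```

The flat array has no nesting, so the loop has to remember which datafield it is inside; that is what $current is for.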
dwich
Exactly what I needed. HUGE thanks!
stormdrain
@stormdrain: My pleasure :) Enjoy
dwich
+2  A: 

There's probably nothing wrong with xml_parse_into_struct(). But since it has been stated that this can't be done with SimpleXML:

<?php 
$file="http://z3950.loc.gov:7090/voyager?version=1.1&operation=searchRetrieve&query=9780471615156&maximumRecords=1&recordPacking=xml&recordSchema=marcxml";
$xml = simplexml_load_file($file);
$xml->registerXPathNamespace('foo', 'http://www.loc.gov/MARC21/slim');

foreach( $xml->xpath('//foo:record') as $record ) {
  echo "record: \n";
  $record->registerXPathNamespace('foo', 'http://www.loc.gov/MARC21/slim');
  foreach( $record->xpath('foo:datafield[@tag="060" or @tag="100" or @tag="245"]') as $datafield ) {
    switch($datafield['tag']) {
      case '060':
        echo "  call number: \n";
        break;
      case '100':
        echo "author: \n";
        break;
      case '245':
        echo "title : \n";
        break;
    }
    $datafield->registerXPathNamespace('foo', 'http://www.loc.gov/MARC21/slim');
    foreach( $datafield->xpath('foo:subfield') as $sf ) {
      echo '   ', $sf['code'] . ': ' . $sf . "\n";
    }    
  }
}

prints

record: 
  call number: 
   a: WM 173.6 R823m
author: 
   a: Ross, Colin A.
title : 
   a: Multiple personality disorder :
   b: diagnosis, clinical features, and treatment /
   c: Colin A. Ross.

It's a bit annoying that you have to register the namespace again for each subsequent SimpleXMLElement, but anyway it works and it uses SimpleXML ;-)
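
If the repeated registration bothers you, one possible alternative is to use XPath only once to find the records and then walk down with children() and the namespace URI. This is just a sketch: the inline sample is an abridged stand-in for the real response (the zs: wrapper and its namespace URI are my assumption about what the server returns), and $marcNs, $labels and $result are names invented for the example. Note that attributes are read through attributes() here, since the tag and code attributes are not in the MARC namespace.

```php
<?php
// Abridged sample; the zs: wrapper elements are an assumed shape of the real response.
$string = <<<'XML'
<zs:searchRetrieveResponse xmlns:zs="http://www.loc.gov/zing/srw/">
  <zs:records>
    <zs:record>
      <zs:recordData>
        <record xmlns="http://www.loc.gov/MARC21/slim">
          <datafield tag="060" ind1=" " ind2=" ">
            <subfield code="a">WM 173.6 R823m</subfield>
          </datafield>
          <datafield tag="100" ind1="1" ind2=" ">
            <subfield code="a">Ross, Colin A.</subfield>
          </datafield>
          <datafield tag="245" ind1="1" ind2="0">
            <subfield code="a">Multiple personality disorder :</subfield>
            <subfield code="b">diagnosis, clinical features, and treatment /</subfield>
            <subfield code="c">Colin A. Ross.</subfield>
          </datafield>
        </record>
      </zs:recordData>
    </zs:record>
  </zs:records>
</zs:searchRetrieveResponse>
XML;

$marcNs = 'http://www.loc.gov/MARC21/slim';
$labels = ['060' => 'call number', '100' => 'author', '245' => 'title'];

$xml = simplexml_load_string($string);
// Register the prefix once, only for the initial record lookup.
$xml->registerXPathNamespace('marc', $marcNs);

$result = [];
foreach ($xml->xpath('//marc:record') as $record) {
    $fields = [];
    // children($marcNs) selects child elements in the MARC namespace, no XPath needed.
    foreach ($record->children($marcNs)->datafield as $datafield) {
        // attributes() without arguments returns the un-namespaced attributes.
        $tag = (string) $datafield->attributes()->tag;
        if (!isset($labels[$tag])) {
            continue;
        }
        foreach ($datafield->children($marcNs)->subfield as $sf) {
            $code = (string) $sf->attributes()->code;
            $fields[$labels[$tag]][$code] = (string) $sf;
        }
    }
    $result[] = $fields;
}

print_r($result);
```

This collects the values into an array per record instead of echoing them, which may be handier if you want to store them afterwards.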

see also: http://docs.php.net/simplexmlelement.registerXPathNamespace and http://www.w3.org/TR/xpath/

VolkerK