I need to read in and parse data from a third-party website that sends XML. All of this needs to be done server-side.

What is the best way to do this using PHP?

+3  A: 

You can obtain the remote XML data with, e.g.

$xmldata = file_get_contents("http://www.example.com/xmldata");

or with curl. Then use SimpleXML, DOM, whatever.
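
As a rough sketch of the "then use SimpleXML" step: the snippet below parses an XML string and copies a few fields into a plain array. The `<catalog>`/`<product>` structure and the `$products` variable are made up for illustration; the real feed will have its own element names. In practice `$xmldata` would come from `file_get_contents()` as above, which returns `false` on failure, so that result should be checked before parsing.

```php
<?php
// Hypothetical sample payload standing in for the remote response.
$xmldata = '<catalog>'
    . '<product id="1"><name>Widget</name><price>9.99</price></product>'
    . '<product id="2"><name>Gadget</name><price>19.50</price></product>'
    . '</catalog>';

// Collect parse errors instead of letting libxml emit PHP warnings.
libxml_use_internal_errors(true);
$xml = simplexml_load_string($xmldata);

$products = [];
if ($xml === false) {
    // Malformed XML: log what libxml reported.
    foreach (libxml_get_errors() as $error) {
        error_log(trim($error->message));
    }
    libxml_clear_errors();
} else {
    foreach ($xml->product as $product) {
        // SimpleXML nodes are objects; cast them to get scalar values.
        $products[] = [
            'id'    => (int) $product['id'],
            'name'  => (string) $product->name,
            'price' => (float) $product->price,
        ];
    }
}

print_r($products);
```

The explicit casts matter: without them the array would hold `SimpleXMLElement` objects rather than strings and numbers.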

Artefacto
If `allow_url_fopen` is enabled, you can load the XML directly with `simplexml_load_file()` or `DOMDocument::load()`.
Gordon
A: 

I have been using SimpleXML for a while.

sushil bharwani
@Steven1350: I think this is simple enough for retrieving and parsing the data. It's easy to use and probably offers the features you need :)
faileN
+1  A: 

A good way of parsing XML is often to use an XML pull parsing (XPP) library. PHP has an implementation of it called XMLReader.

http://php.net/manual/en/class.xmlreader.php
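
A minimal sketch of pull parsing with XMLReader, assuming a made-up `<catalog>` document in which only the `<name>` elements are of interest. The sample string and the `$names` variable are illustrative only; for a remote document, `$reader->open($url)` could be used in place of `$reader->XML(...)` so the input is streamed rather than held in memory.

```php
<?php
// Hypothetical sample document; real input would come from the third-party site.
$xmldata = '<catalog>'
    . '<product id="1"><name>Widget</name></product>'
    . '<product id="2"><name>Gadget</name></product>'
    . '</catalog>';

$reader = new XMLReader();
$reader->XML($xmldata);

$names = [];
// read() advances one node at a time, so only the current node is in memory.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT && $reader->name === 'name') {
        // readString() returns the text content of the current element.
        $names[] = $reader->readString();
    }
}
$reader->close();

print_r($names);
```

The loop has to track node types by hand, which is the price paid for the low memory footprint.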

HoLyVieR
I'd say it's the most memory efficient. As to "effective"... it's certainly not the simplest.
Artefacto
It is indeed subjective; I edited.
HoLyVieR
+1  A: 

I would suggest using DOMDocument (a class built into PHP). A simple example of its power is the following code:

   /***********************************************************************************************
   Takes the RSS news feed found at $url and prints its items as HTML code.
   Each news item is rendered in a <div class="rss"> block in the order: date + title + description.
   ***********************************************************************************************/
   function Render($url, $max_feeds = 1000)
   {   
      $doc = new DOMDocument();

      if(@$doc->load($url, LIBXML_NOCDATA|LIBXML_NOBLANKS))
      {
         $feed_count = 0;
         $items = $doc->getElementsByTagName("item");
         //echo $items->length; //DEBUG
         foreach($items as $item)
         {              
                if($feed_count >= $max_feeds) //stop once $max_feeds items have been printed
                   break;

                //Unfortunately the elements inside an <item> node are not always in the same order, therefore we have to call getElementsByTagName once per element
                //WARNING: using the iconv function instead of utf8_decode because utf8_decode did not properly convert some characters like apostrophe 0x19 from techsport.it feeds.
                $title = iconv('UTF-8', 'CP1252', $item->getElementsByTagName("title")->item(0)->firstChild->textContent); //can use "CP1252//TRANSLIT"
                $description = iconv('UTF-8', 'CP1252', $item->getElementsByTagName("description")->item(0)->firstChild->textContent); //can use "CP1252//TRANSLIT"
                $link = iconv('UTF-8', 'CP1252', $item->getElementsByTagName("link")->item(0)->firstChild->textContent); //can use "CP1252//TRANSLIT"

                //pubDate tag is not mandatory in RSS [RSS2 spec: http://cyber.law.harvard.edu/rss/rss.html]
                $pub_date = $item->getElementsByTagName("pubDate"); $date_html = "";
                //play with date here if you want

                echo "<div class='rss'>\n<p class='title'><a href='" . $link . "'>" . $title . "</a></p>\n<p class='description'>" . $description . "</p>\n</div>\n\n";

                $feed_count++;
        }
      }
      else
         echo "<div class='rss'>Service not available.</div>";
   }
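
One possible refinement, sketched under assumptions: the function above calls `->item(0)->firstChild` directly, which fails when an `<item>` lacks one of the elements. The `childText()` helper below is hypothetical (not part of DOMDocument) and falls back to a default value instead. The sample feed is made up and is loaded from a string with `loadXML()` so the snippet runs without network access.

```php
<?php
// Hypothetical helper: text of the first <$tag> descendant, or $default if absent.
function childText(DOMElement $parent, $tag, $default = '')
{
    $node = $parent->getElementsByTagName($tag)->item(0);
    return $node !== null ? trim($node->textContent) : $default;
}

$doc = new DOMDocument();
// Made-up feed: the second item has no <link>, neither has a <pubDate>.
$doc->loadXML('<rss><channel>'
    . '<item><title>First</title><link>http://www.example.com/1</link></item>'
    . '<item><title>Second</title></item>'
    . '</channel></rss>');

$rows = [];
foreach ($doc->getElementsByTagName('item') as $item) {
    $rows[] = [
        'title'   => childText($item, 'title'),
        'link'    => childText($item, 'link', '#'),
        'pubDate' => childText($item, 'pubDate'),
    ];
}

print_r($rows);
```

When echoing such values into HTML, passing them through `htmlspecialchars()` first is a sensible precaution, since the feed content comes from a third party.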
Marco Demajo