Hiya,

I've consistently had trouble parsing XML with PHP and have never really found "the right way", or at least a standardised way, of parsing XML files.

First, I'm trying to parse this:

  <item> 
     <title>2884400</title> 
     <description><![CDATA[ ><img width="126" alt="" src="http://userserve-ak.last.fm/serve/126/27319921.jpg" /> ]]></description> 
     <link>http://www.last.fm/music/+noredirect/Beatles/+images/27319921</link> 
     <author>anne710</author> 
     <pubDate>Tue, 21 Apr 2009 16:12:31 +0000</pubDate> 
     <guid>http://www.last.fm/music/+noredirect/Beatles/+images/27319921</guid> 
     <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" fileSize="13065" type="image/jpeg" expression="full"  width="126" height="126" /> 
     <media:thumbnail url="http://userserve-ak.last.fm/serve/126/27319921.jpg" type="image/jpeg" width="126" height="126" /> 
  </item>

I'm using this code:

$doc = new DOMDocument();
$doc->load('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss');
$arrFeeds = array();
foreach ($doc->getElementsByTagName('item') as $node) {
 $itemRSS = array ( 
  'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
  'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
  'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
  'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue
  );
 array_push($arrFeeds, $itemRSS);
}

Now I want to get the url attributes of "media:content" and "media:thumbnail". How would I do that? I think I should be using DOMElement::getAttribute, but I haven't managed to get it to work :/ Can anyone shed some light on this, and also let me know whether this is a good way to parse XML?

Regards, Shadi

A: 

You would want something like this:

'content' => $node->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'content')->item(0)->getAttribute('url'),
'thumbnail' => $node->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'thumbnail')->item(0)->getAttribute('url'),

I believe that will work, but it's been a while since I've done anything like this.
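
For anyone who wants to try this without hitting the live feed, here is a minimal sketch of the same idea run against an inline, trimmed copy of the sample item. The $rss string, the MEDIA_NS constant and the mediaUrl() helper are made up for illustration. The helper checks that item(0) really returned an element before calling getAttribute(), which should avoid a fatal error when the element is missing or the namespace URI does not match.

```php
<?php
// Namespace URI that the feed's <rss> element binds to the "media" prefix.
const MEDIA_NS = 'http://search.yahoo.com/mrss/';

// Assumption: a trimmed inline copy of the sample item, standing in for the feed.
$rss = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>2884400</title>
      <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" type="image/jpeg" width="126" height="126" />
      <media:thumbnail url="http://userserve-ak.last.fm/serve/126/27319921.jpg" type="image/jpeg" width="126" height="126" />
    </item>
  </channel>
</rss>
XML;

// Hypothetical helper: url attribute of the first matching media:* element, or null.
function mediaUrl(DOMElement $item, string $localName): ?string
{
    $el = $item->getElementsByTagNameNS(MEDIA_NS, $localName)->item(0);
    return $el instanceof DOMElement ? $el->getAttribute('url') : null;
}

$doc = new DOMDocument();
$doc->loadXML($rss);

$arrFeeds = array();
foreach ($doc->getElementsByTagName('item') as $node) {
    $arrFeeds[] = array(
        'title'     => $node->getElementsByTagName('title')->item(0)->nodeValue,
        'content'   => mediaUrl($node, 'content'),
        'thumbnail' => mediaUrl($node, 'thumbnail'),
    );
}

print_r($arrFeeds);
```

With the real feed you would swap loadXML($rss) for load() and the feed URL, as in the question.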

Craig Martek
The feed's root element is <rss version="2.0" xmlns:creativeCommons="http://backend.userland.com/creativeCommonsRssModule" xmlns:media="http://search.yahoo.com/mrss/">, so how do I implement that?
Shadi Almosri
Did this not work?
Craig Martek
[Mon Jul 13 23:13:04 2009] [error] [client xxx.xxx.xxx.xxx] PHP Fatal error: Call to a member function getAttribute() on a non-object in /v2.php on line 73
Shadi Almosri
Try putting an http:// in front of the namespace.
Craig Martek
This is a good solution. One thing is confusing, though: getElementsByTagNameNS is usually called not on $node (which is part of an iteration) but on the XML document root, the main DOM object. If `$xml = new DOMDocument();`, then this is how it works: `$content = $xml->getElementsByTagNameNS('http://search.yahoo.com/mrss/', 'content')->item($i);`
Tamas Kalman
A: 
<?php

# Convert the string into XML
$xml = new SimpleXMLElement($_POST['name']);

# Iterate through the XML for the data

// Leftover from the larger snippet this was taken from; unused here
$values = "VALUES('' , ";
foreach($xml->item as $item)
{
 // you now have access to that item
}

?>
PSU_Kardi
Hmm, this hasn't really worked. I tried putting the URL in place of $_POST, but it doesn't fetch the file. I then read the file into a variable and passed that to SimpleXMLElement, but there still wasn't anything inside $item.
Shadi Almosri
That was actually part of a snippet from my own code. I should mention that you need to change $xml->item so that it matches the XML feed you are getting. I would look at the SimpleXMLElement documentation, but that's what I use to work with XML that I send from Adobe Flex.
PSU_Kardi
A: 

Try using SimpleXML: http://us2.php.net/simplexml

ctshryock
Running the data through SimpleXML doesn't seem to help. It doesn't pick up any of the <media:content> and <media:thumbnail> elements, just the rest.
Shadi Almosri
I suggested SimpleXML as well
PSU_Kardi
+2  A: 

You can use SimpleXML as suggested by the other posters, but you need to use the children() and attributes() methods so you can deal with the different namespaces.

Example (untested):

$feed = file_get_contents('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss');
$xml = new SimpleXMLElement($feed);
foreach ($xml->channel->item as $item) {
    foreach ($item->children('http://search.yahoo.com/mrss/') as $media_element) {
        var_dump($media_element);
    }
}
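
To go from the dumped elements to the actual url values, something along these lines should work. It is only a sketch run against an inline, trimmed copy of the sample item (the $feed string and the $urls array are made up for illustration). The point is that children() is given the namespace URI, while the url attribute itself is not namespaced, so a plain attributes() call is enough to reach it.

```php
<?php
// Assumption: a trimmed inline copy of the sample item, standing in for the feed.
$feed = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>2884400</title>
      <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" type="image/jpeg" width="126" height="126" />
      <media:thumbnail url="http://userserve-ak.last.fm/serve/126/27319921.jpg" type="image/jpeg" width="126" height="126" />
    </item>
  </channel>
</rss>
XML;

$xml = new SimpleXMLElement($feed);

$urls = array();
foreach ($xml->channel->item as $item) {
    // Passing the namespace URI restricts the iteration to the media:* children.
    foreach ($item->children('http://search.yahoo.com/mrss/') as $media) {
        // The url attribute has no namespace, so attributes() without arguments finds it.
        $attrs = $media->attributes();
        // getName() gives the local name: "content" or "thumbnail".
        $urls[$media->getName()] = (string) $attrs['url'];
    }
}

print_r($urls);
```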

Alternatively, you can use XPath (again, untested):

$feed = file_get_contents('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss');
$xml = new SimpleXMLElement($feed);
$xml->registerXPathNamespace('media', 'http://search.yahoo.com/mrss/');
$images = $xml->xpath('/rss/channel/item/media:content/@url');
var_dump($images);
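
A self-contained variant of the XPath route, again only a sketch against an inline copy of the sample item (the $feed string and $urls array are made up for illustration). Two details matter: the prefix has to be registered against the namespace URI declared on the <rss> element, not the feed URL, and an attribute is selected with a /@url step. Casting each result to a string gives the plain attribute value.

```php
<?php
// Assumption: a trimmed inline copy of the sample item, standing in for the feed.
$feed = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>2884400</title>
      <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" type="image/jpeg" width="126" height="126" />
      <media:thumbnail url="http://userserve-ak.last.fm/serve/126/27319921.jpg" type="image/jpeg" width="126" height="126" />
    </item>
  </channel>
</rss>
XML;

$xml = new SimpleXMLElement($feed);

// Bind the "media" prefix used in the query to the Media RSS namespace URI.
$xml->registerXPathNamespace('media', 'http://search.yahoo.com/mrss/');

// xpath() returns attribute nodes; strval turns each one into its value.
$urls = array_map('strval', $xml->xpath('/rss/channel/item/media:content/@url'));

print_r($urls);
```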
Sander Marechal
A: 

This is how I eventually did it, using XMLReader:

<?php

define ('XMLFILE', 'http://ws.audioscrobbler.com/2.0/artist/vasco%20rossi/images.rss');
echo "<pre>";

$items = array ();
$i = 0;

$xmlReader = new XMLReader();
$xmlReader->open(XMLFILE, null, LIBXML_NOBLANKS);

$isParserActive = false;
$simpleNodeTypes = array ("title", "description", "media:title", "link", "author", "pubDate", "guid");

while ($xmlReader->read ())
{
    $nodeType = $xmlReader->nodeType;

    // Only deal with Beginning/Ending Tags
    if ($nodeType != XMLReader::ELEMENT && $nodeType != XMLReader::END_ELEMENT) { continue; }
    else if ($xmlReader->name == "item") {
        if (($nodeType == XMLReader::END_ELEMENT) && $isParserActive) { $i++; }
        $isParserActive = ($nodeType != XMLReader::END_ELEMENT);
    }

    if (!$isParserActive || $nodeType == XMLReader::END_ELEMENT) { continue; }

    $name = $xmlReader->name;

    if (in_array ($name, $simpleNodeTypes)) {
        // Skip to the text node
        $xmlReader->read ();
        $items[$i][$name] = $xmlReader->value;
    } else if ($name == "media:thumbnail") {
        $items[$i]['media:thumbnail'] = array (
                "url" => $xmlReader->getAttribute("url"),
                "width" => $xmlReader->getAttribute("width"),
                "height" => $xmlReader->getAttribute("height"),
                "type" => $xmlReader->getAttribute("type")
        );
    } else if ($name == "media:content") {
        $items[$i]['media:content'] = array (
                "url" => $xmlReader->getAttribute("url"),
                "width" => $xmlReader->getAttribute("width"),
                "height" => $xmlReader->getAttribute("height"),
                "filesize" => $xmlReader->getAttribute("fileSize"),
                "expression" => $xmlReader->getAttribute("expression")
        );
    }
}

print_r($items);
echo "</pre>";

?>
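
A possible refinement, shown only as a sketch: the code above matches on the literal prefix "media:", which depends on the feed continuing to use that prefix. Matching on namespaceURI and localName instead does not depend on the prefix. The inline $feed string and the $thumbs array are made up for illustration.

```php
<?php
// Assumption: a trimmed inline copy of the sample item, standing in for the feed.
$feed = <<<XML
<rss version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
  <channel>
    <item>
      <title>2884400</title>
      <media:content url="http://userserve-ak.last.fm/serve/_/27319921/Beatles+2884400.jpg" type="image/jpeg" width="126" height="126" />
      <media:thumbnail url="http://userserve-ak.last.fm/serve/126/27319921.jpg" type="image/jpeg" width="126" height="126" />
    </item>
  </channel>
</rss>
XML;

$reader = new XMLReader();
$reader->XML($feed); // parse from a string; use open() with a URL for the real feed

$thumbs = array();
while ($reader->read()) {
    // Look for start tags named "thumbnail" in the Media RSS namespace,
    // whatever prefix the document happens to use for it.
    if ($reader->nodeType === XMLReader::ELEMENT
        && $reader->namespaceURI === 'http://search.yahoo.com/mrss/'
        && $reader->localName === 'thumbnail') {
        $thumbs[] = $reader->getAttribute('url');
    }
}
$reader->close();

print_r($thumbs);
```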
Shadi Almosri
A: 

Try this. It'll work fine.

$doc = new DOMDocument();
$doc->load('http://ws.audioscrobbler.com/2.0/artist/beatles/images.rss');
$arrFeeds = array();
foreach ($doc->getElementsByTagName('item') as $node) {
    $itemRSS = array (
        'title' => $node->getElementsByTagName('title')->item(0)->nodeValue,
        'desc' => $node->getElementsByTagName('description')->item(0)->nodeValue,
        'link' => $node->getElementsByTagName('link')->item(0)->nodeValue,
        'date' => $node->getElementsByTagName('pubDate')->item(0)->nodeValue,
        'thumbnail' => $node->getElementsByTagName('thumbnail')->item(0)->getAttribute('url')
    );
    array_push($arrFeeds, $itemRSS);
}

Helder Robalo
A: 

So how do you do it in the end?

neimspeis