URL:

http://en.wikipedia.org/w/api.php?action=parse&prop=text&page=Lost_(TV_series)&format=xml

This outputs something like:

<api><parse><text xml:space="preserve">text...</text></parse></api>

How do I get just the content between <text xml:space="preserve"> and </text>?

I used cURL to fetch the content from this URL, which gives me:

$html = curl_exec($curl_handle);

What's the next step?

+1  A: 

Use PHP DOM to parse it. Do it like this:

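// In practice $html already holds the cURL response; a sample string is used here.
$html = '<api><parse><text xml:space="preserve">text...</text></parse></api>';

// Parsing begins here. The response is XML, so load it with loadXML()
// rather than loadHTML(), which avoids the need for the @ error suppressor.
$doc = new DOMDocument();
$doc->loadXML($html);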
$nodes = $doc->getElementsByTagName('text');

//display what you need:
echo $nodes->item(0)->nodeValue;

This outputs:

text...
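
If you want something a bit more defensive, here is a minimal sketch that wraps the same DOM approach in a function and returns null when the response is empty, malformed, or has no <text> element. The function name extractParseText is made up for this illustration and is not part of PHP or the MediaWiki API.

```php
<?php
// Hypothetical helper (name chosen for illustration only): returns the
// contents of the first <text> element in an XML string, or null if the
// string is empty, cannot be parsed, or contains no such element.
function extractParseText(string $xml): ?string
{
    // loadXML() rejects an empty string, so handle that case up front.
    if ($xml === '') {
        return null;
    }

    // Collect libxml errors internally instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $loaded = $doc->loadXML($xml);

    // Discard collected errors and restore the previous error handling mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$loaded) {
        return null;
    }

    $nodes = $doc->getElementsByTagName('text');
    if ($nodes->length === 0) {
        return null;
    }

    return $nodes->item(0)->textContent;
}

// Sample input matching the shape shown in the question.
$sample = '<api><parse><text xml:space="preserve">text...</text></parse></api>';
echo extractParseText($sample), "\n"; // prints "text..."
```

This assumes $html really contains the response body, which with cURL means CURLOPT_RETURNTRANSFER was set to true before calling curl_exec(); otherwise curl_exec() prints the response and returns a boolean. If the markup inside <text> is entity-escaped, textContent gives it back decoded.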

shamittomar