If you search Alexa for any URL, you get detailed traffic information for that site. I would like to parse the Visitors by Country information from Alexa.

Example for google.com:

The URL is http://www.alexa.com/siteinfo/google.com.

On the Audience tab you can see:

Visitors by Country for Google.com

United States 35.0%

India 8.8%

China 4.1%

Germany 3.4%

United Kingdom 3.2%

Brazil 3.2%

Iran 2.8%

Japan 2.1%

Russia 2.0%

Italy 1.9%

Indonesia 1.7% //etc.

How can I get only this information from alexa.com? I have tried the preg_match function, but it is very difficult in this case.

A:

If you don't want to use the DOM and getElementById, which is the most elegant solution in this case, you can try a regular expression:

$data = file_get_contents('http://www.alexa.com/siteinfo/google.com');
preg_match_all(
   '/<a href="\/topsites\/countries\/(.*)">(.*)<\/a>/mU',
   $data,
   $result,
   PREG_SET_ORDER
);
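
With PREG_SET_ORDER, each element of $result is one complete match: index 0 holds the full match, index 1 the captured path segment, and index 2 the link text. Here is a minimal sketch of reading it, run against a hypothetical HTML fragment (the markup and the $names variable are assumptions for illustration, not copied from the Alexa page):

```php
<?php
// Hypothetical fragment standing in for the downloaded page; the real markup may differ.
$data = '<a href="/topsites/countries/US">United States</a> 35.0%' . "\n"
      . '<a href="/topsites/countries/IN">India</a> 8.8%';

preg_match_all(
    '/<a href="\/topsites\/countries\/(.*)">(.*)<\/a>/mU',
    $data,
    $result,
    PREG_SET_ORDER
);

$names = array();
foreach ($result as $match) {
    // $match[1] is the captured path segment, $match[2] is the link text.
    $names[$match[1]] = $match[2];
}

print_r($names);
```

Note that this pattern only captures the link itself, so the percentage that follows the link is not part of the result.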

The DOM solution looks like:

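// Collect parser warnings internally; real-world HTML is rarely valid.
libxml_use_internal_errors(true);

$doc = new DOMDocument;
$doc->loadHTMLFile('http://www.alexa.com/siteinfo/google.com');

// Container that holds the "Visitors by Country" rows.
$data = $doc->getElementById('visitors-by-country');
$my_data = $data ? $data->getElementsByTagName('div') : array();

$countries = array();
foreach ($my_data as $node)
{
    foreach ($node->getElementsByTagName('a') as $href)
    {
        // The row text contains the percentage, e.g. "35.0%".
        if (preg_match('/([0-9\.\%]+)/', $node->nodeValue, $match))
        {
            $countries[trim($href->nodeValue)] = $match[0];
        }
    }
}

var_dump($countries);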
narcisradu
Wow, so short. Great, thank you. But how does the DOM approach work in this case? I ask partly for my own knowledge, but if it is faster than this, I will definitely use it.
mathew
But the percentage isn't coming through.
mathew
For the percentage, the preg_match_all pattern should be adjusted to match <div class="tr1 "> and what's inside.
narcisradu
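
One way that adjustment could look, as a rough sketch only: the class name "tr1 " comes from the comment above, but the inner layout of the row (a link followed by the percentage in a span) is assumed, and $rows is a name chosen for the example. It uses explicit lazy quantifiers instead of the U modifier.

```php
<?php
// Hypothetical row markup; only the class name "tr1 " is taken from the thread.
$data = '<div class="tr1 "><a href="/topsites/countries/US">United States</a> <span>35.0%</span></div>'
      . '<div class="tr1 "><a href="/topsites/countries/IN">India</a> <span>8.8%</span></div>';

// Capture the link text, then the first number followed by "%" after the link.
preg_match_all(
    '/<div class="tr1 ">.*?<a [^>]*>(.*?)<\/a>.*?([0-9.]+%)/s',
    $data,
    $rows,
    PREG_SET_ORDER
);

$countries = array();
foreach ($rows as $row) {
    // $row[1] is the country name, $row[2] the percentage.
    $countries[trim($row[1])] = $row[2];
}

print_r($countries);
```

If the real rows are structured differently, the pattern would need to change accordingly, which is one reason the DOM version tends to be less fragile.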
The DOM version works great, but how do I extract both values, country and percentage, from this kind of array? Array ( [Â United States] => 35.0% [Â India] => 8.8% [Â China] => 4.1% [Â Germany] => 3.4% ...
mathew
foreach ($countries as $country => $percent) echo $country . ' - ' . $percent; will display the country and the percent. It's just array manipulation now.
narcisradu
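
The stray "Â" in the keys above is probably a UTF-8 non-breaking space (bytes C2 A0) shown in a single-byte encoding. That is an assumption based on the output, not something checked against the page. A small sketch of the same loop that also cleans the keys, using sample data shaped like the array above ($lines and $name are names chosen for the example):

```php
<?php
// Sample data shaped like the array in the comment above.
// "\xC2\xA0" is a UTF-8 non-breaking space, assumed to be the source of the "Â".
$countries = array(
    "\xC2\xA0United States" => '35.0%',
    "\xC2\xA0India"         => '8.8%',
);

$lines = array();
foreach ($countries as $country => $percent) {
    // Turn non-breaking spaces into ordinary spaces, then trim the ends.
    $name = trim(str_replace("\xC2\xA0", ' ', $country));
    $lines[] = $name . ' - ' . $percent;
}

echo implode("\n", $lines), "\n";
```

Plain trim() leaves these bytes in place because they are not in its default character list, which would explain why they survive in the DOM version's keys.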
Ah, I missed that. It works great. Thanks, narcisradu.
mathew