I spent time on a regex to solve this problem but got no result. I am trying to solve it using PHP 5.3. I need information like: - How many times a particular tag repeats in the page - Information about all tags in the page

A: 

I suggest you check out Simple HTML DOM:

http://simplehtmldom.sourceforge.net/manual.htm

Lizard
It's a big package for one simple task :/
RobertPitt
+3  A: 

Your question is unfortunately barely understandable in its current form. Please try to update it and be more specific. If you want to count all HTML tags in a page, you can do:

$HTML = <<< HTML
<html>
    <head>
        <title>Some Text</title>
    </head>
    <body>
        <p>Hello World<br/>
            <img src="earth.jpg" alt="picture of earth from space"/>
        </p>
        <p>Counting Elements is easy with DOM</p>
    </body>
</html>
HTML;

Counting all DOMElements with DOM:

$dom = new DOMDocument;
$dom->loadHTML($HTML);
$allElements = $dom->getElementsByTagName('*');
echo $allElements->length;

The above will output 8, because there are eight elements in the DOM. If you also need to know the distribution of elements, you can do:

$elementDistribution = array();
foreach($allElements as $element) {
    if(array_key_exists($element->tagName, $elementDistribution)) {
        $elementDistribution[$element->tagName] += 1;
    } else {
        $elementDistribution[$element->tagName] = 1;
    }
}
print_r($elementDistribution);

This would return

Array (
    [html] => 1
    [head] => 1
    [title] => 1
    [body] => 1
    [p] => 2
    [br] => 1
    [img] => 1
)
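As a rough sketch, the two steps above can be folded into one helper. The function name countTags and the sample markup are made up for illustration; array_count_values(), arsort() and libxml_use_internal_errors() are standard PHP functions.

```php
<?php
// countTags() is a hypothetical helper name, not part of the DOM extension.
function countTags($html)
{
    // Collect libxml parse warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $dom = new DOMDocument;
    $dom->loadHTML($html);
    libxml_clear_errors();

    // Gather the tag name of every element in the document.
    $names = array();
    foreach ($dom->getElementsByTagName('*') as $element) {
        $names[] = $element->tagName;
    }

    // Count occurrences per tag name, most frequent first.
    $counts = array_count_values($names);
    arsort($counts);
    return $counts;
}

// Made-up sample input.
$counts = countTags('<html><body><div><a href="#">x</a></div><div><p>y</p></div></body></html>');
foreach ($counts as $tag => $count) {
    echo $tag, ' - ', $count, "\n";
}
```

This prints one "tag - count" line per tag name, with div (two occurrences) first.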

Note that getElementsByTagName returns DOMElements only. It does not take into account closing tags, nor does it return other DOMNodes. If you also need to count closing tags and other node types, consider using XMLReader instead.
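A minimal sketch of the XMLReader idea, assuming the input is well-formed XML (XMLReader is a strict XML parser and will not repair broken HTML the way DOMDocument::loadHTML does). The sample string is made up for illustration.

```php
<?php
// Made-up, well-formed sample markup.
$xml = '<html><body><p>Hello<br/></p><p>World</p></body></html>';

$reader = new XMLReader;
$reader->XML($xml);

$opening = 0; // start tags, including self-closing ones like <br/>
$closing = 0; // explicit end tags like </p>

// Walk the document node by node and tally by node type.
while ($reader->read()) {
    if ($reader->nodeType === XMLReader::ELEMENT) {
        $opening++;
    } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
        $closing++;
    }
}
$reader->close();

echo "opening: $opening, closing: $closing\n";
```

Self-closing elements such as <br/> are reported as an element node only, so they add to the opening count but not to the closing count.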

Gordon
A: 
$testHTML = file_get_contents('index.html');

// Capture the name of each opening tag; closing tags, comments and doctypes are skipped
$search = preg_match_all('/<([a-z][a-z0-9]*)/i',$testHTML,$matches);

echo '<pre>';
var_dump($matches[1]);
echo '</pre>';

This gives you an array of all the tags. Once the data is in the array, you can use all the standard PHP array functions, e.g. array_count_values(), to extract the details you want... though you're not really saying what information you want about the HTML tags.

Using array_count_values() with the results of the preg_match_all():

echo '<pre>';
var_dump(array_count_values($matches[1]));
echo '</pre>';

gives

array(5) {
  ["html"]=>
  int(1)
  ["head"]=>
  int(1)
  ["title"]=>
  int(1)
  ["body"]=>
  int(1)
  ["h1"]=>
  int(2)
}

Is this what you want?
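One caveat with the regex approach is that tag names are captured exactly as written, so "P" and "p" would be counted separately. A hedged sketch of one way around that, using a made-up input string and a pattern that only accepts names starting with a letter:

```php
<?php
// Made-up input with mixed-case tag names.
$testHTML = '<DIV><p>one</p><P>two</P></div>';

// Capture opening tag names only; "</" does not match because "/" is not a letter.
preg_match_all('/<([a-z][a-z0-9]*)/i', $testHTML, $matches);

// Lower-case every captured name so that "P" and "p" share one counter.
$counts = array_count_values(array_map('strtolower', $matches[1]));

print_r($counts);
```

Like any regex-based count, this still sees tags inside comments or script blocks, so treat the numbers as approximate on real pages.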

Mark Baker
So why the downvote?
Mark Baker
The information I need looks like: div - 5, a - 7, p - 22. Maybe DOMDocument is not the best solution for this task?
Alexandr
Will somebody please tell me why I am being marked down? If I've answered this incorrectly, it would be very useful for me to know how and why I've f***ed up
Mark Baker
Yes, thank you very much, this is what I have been struggling with for so long. It's a great answer. Thank you very much.
Alexandr