I'm writing a PHP page that parses a given URL. So far I can only find the first occurrence of a string, and when I echo it, I get a different value from the one I searched for.

This is what I have so far:

<?php
$URL = @"my URL goes here";//get from database
$str = file_get_contents($URL);
$toFind = "string to find";
$pos = strpos(htmlspecialchars($str),$toFind);
echo substr($str,$pos,strlen($toFind)) . "<br />";
$offset = $offset + strlen($toFind);
?>

I know that a loop can be used, but I don't know what the condition or the body of the loop should be.

And how can I show the output I need?

+5  A: 

This happens because you are using strpos on the htmlspecialchars($str) but you are using substr on $str.

htmlspecialchars() converts special characters to HTML entities. Take a small example:

// search 'foo' in '&foobar'

$str = "&foobar";
$toFind = "foo";

// htmlspecialchars($str) gives you "&amp;foobar"
// as & is replaced by &amp;. strpos returns 5
$pos = strpos(htmlspecialchars($str),$toFind);

// now you try to extract 3 chars starting at index 5 in the original
// string, even though 'foo' starts at index 1 there.
echo substr($str,$pos,strlen($toFind)); // prints ar

To fix this, use the same haystack in both functions.
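
A minimal sketch of what "the same haystack" could look like, reusing the sample values from above. The variable name $escaped is only illustrative; either option works as long as strpos and substr see the same string.

```php
<?php
$str = "&foobar";
$toFind = "foo";

// Option 1: search and extract on the raw string.
$pos = strpos($str, $toFind);
if ($pos !== false) {
    echo substr($str, $pos, strlen($toFind)), "\n"; // prints foo
}

// Option 2: escape once, then search and extract on the escaped copy.
$escaped = htmlspecialchars($str); // "&amp;foobar"
$pos = strpos($escaped, $toFind);
if ($pos !== false) {
    echo substr($escaped, $pos, strlen($toFind)), "\n"; // prints foo
}
```

Note the strict !== false check: strpos returns 0 for a match at the very start of the string, which a loose comparison would treat as "not found".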

To answer your other question about finding all occurrences of one string in another: you can use the third argument of strpos, offset, which specifies where the search starts. Example:

$str = "&foobar&foobaz";
$toFind = "foo";
$start = 0;
while (($pos = strpos($str, $toFind, $start)) !== false) {
        echo 'Found '.$toFind.' at position '.$pos."\n";
        $start = $pos+1; // start searching from next position.
}

Output:

Found foo at position 1
Found foo at position 8
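
If you need the positions later rather than just echoing them, the same loop can be wrapped in a small function. This is only a sketch: findAllPositions is a made-up name, not a built-in, and it assumes overlapping matches should be reported.

```php
<?php
// Hypothetical helper: returns every index at which $needle occurs in $haystack.
function findAllPositions(string $haystack, string $needle): array
{
    $positions = [];
    if ($needle === '') {
        return $positions; // nothing sensible to report for an empty needle
    }
    $start = 0;
    // The assignment is parenthesized so $pos holds the index, not a boolean.
    while (($pos = strpos($haystack, $needle, $start)) !== false) {
        $positions[] = $pos;
        $start = $pos + 1; // use $pos + strlen($needle) to skip overlapping matches
    }
    return $positions;
}

print_r(findAllPositions("&foobar&foobaz", "foo")); // positions 1 and 8
```

The parentheses around the assignment matter: !== binds tighter than =, so without them $pos would receive the result of the comparison instead of the position.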

codaddict
To elaborate a little, htmlspecialchars replaces some characters with several new ones, which means you can't use the same indexes in both strings and expect the same result.
dutt
@codaddict: This solved one problem which is the output. But what about the other problem?
sikas
@codaddict: That is what I was looking for :D ... thank you.
sikas