How can I fix this?

REGEX:
//REGEX
$match_expression = '/Rt..tt<\/td> <td>(.*)<\/td>/';
preg_match($match_expression,$text,$matches1);
$final =  $matches1[1];       


//THIS IS WORKING
<tr> <td class="rowhead vtop">Rtštt</td> <td><img border=0 src="http://somephoto"><br /> <br />INFO INFO INFO</td>
</tr> 


//THIS IS NOT WORKING
<tr> <td class="rowhead vtop">Rtštt</td> <td> <br />
INFO<br />
INFO<br /></td></tr>
+2  A: 

You're doing it wrong!

Having said that, a solution to your question is:

/Rt..tt<\/td> <td>(.*)<\/td>/

should be

/Rt..tt<\/td> <td>(.*)<\/td>/s

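By default, . does not match newline characters, so the pattern fails on the second sample, where the cell spans several lines. The s modifier (PCRE_DOTALL) makes . match newlines as well. See http://php.net/manual/en/reference.pcre.pattern.modifiers.php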
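To see the difference, here is a minimal sketch run against the multi-line row from the question. The sample string is retyped by hand, with š written as its two UTF-8 bytes (an assumption about the page encoding, and the reason the two dots in the pattern match it). The lazy .*? is my own addition, not part of the original pattern: with /s, a greedy .* would run to the last </td> in the input if it held more than one row.

```php
<?php
// Multi-line row from the question; "\xC5\xA1" is "š" encoded as UTF-8 (assumed).
$text = "<tr> <td class=\"rowhead vtop\">Rt\xC5\xA1tt</td> <td> <br />\nINFO<br />\nINFO<br /></td></tr>";

// Without /s, "." stops at the first newline, so the closing </td> is never reached.
$without = preg_match('/Rt..tt<\/td> <td>(.*)<\/td>/', $text, $m1);

// With /s (PCRE_DOTALL), "." also matches newlines.
// ".*?" is lazy so the capture ends at the first </td> that follows.
$with = preg_match('/Rt..tt<\/td> <td>(.*?)<\/td>/s', $text, $m2);

var_dump($without); // int(0)
var_dump($with);    // int(1)
echo $m2[1];        // the cell contents, newlines included
```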

Turtle
+5  A: 

And this is exactly why you shouldn't be using regular expressions to extract data from an HTML document.

HTML markup can vary in whitespace, attribute order, and line breaks, so a pattern that matches one row can silently fail on the next. That is why I won't give you a "proper" regular expression: there is none (the solutions given by other users might work... until they break). Use a DOM parser like DOMDocument or phpQuery to extract data from your document.

Here is an example using phpQuery:

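$pq = phpQuery::newDocumentFile('somefile.html');
// Select the label cells (':parent' would only filter for non-empty elements).
$labels = $pq->find('td.rowhead.vtop');

$matches = array();

// Iterating yields plain DOM nodes, so wrap each one with pq() again.
foreach($labels as $label) {
  // The second cell of the label's row holds the value.
  $matches[] = pq($label)->parent()->children('td')->eq(1)->html();
}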
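If you would rather avoid an extra library, the same idea can be sketched with the built-in DOMDocument and DOMXPath classes. This is only an illustration under some assumptions: the HTML string is a hand-made sample wrapped in a table element, and the XPath expression keys on the rowhead class alone.

```php
<?php
// Hand-made sample based on the row from the question.
$html = '<table><tr> <td class="rowhead vtop">Rtštt</td> <td> <br />' . "\n"
      . 'INFO<br />' . "\n" . 'INFO<br /></td></tr></table>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
// For every td whose class list contains "rowhead", take the next td sibling.
$cells = $xpath->query(
    '//td[contains(concat(" ", normalize-space(@class), " "), " rowhead ")]'
    . '/following-sibling::td[1]'
);

$matches = array();
foreach ($cells as $cell) {
    // Rebuild the inner HTML by serializing each child node.
    $inner = '';
    foreach ($cell->childNodes as $child) {
        $inner .= $doc->saveHTML($child);
    }
    $matches[] = $inner;
}

print_r($matches);
```

Line breaks inside the cell make no difference here, because the parser works on the element tree rather than on the raw text.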
Andrew Moore
A: 
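// Split the markup into one chunk per table row.
$s = explode('</tr>',$str);
foreach($s as $v){
 // Look for the start of the image attributes inside this row.
 $m=strpos($v,"img border");
 if($m!==FALSE){
    // Print everything from that point to the end of the row chunk.
    print substr($v,$m);
 }
}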
ghostdog74