I think this should be elementary, but I still can't get my head around it. Let's say there's a fair number of HTML documents and I need to extract every image URL from them. The rest of the content changes, but the base of the URL is always the same, for example http://images.examplesite.com/images/,

so I want to extract every string that contains that part. The problem is that they're always mixed in with HTML tags, so how could I drop those out? preg_match, probably?

+1  A: 

Try something like: preg_match_all('/http:\/\/images\.examplesite\.com\/images\/(.*?)"/i', $html_data, $results, PREG_SET_ORDER)

narcisradu
wow, that was fast. it leaves one " after the string, but believe it or not I got rid of it myself ;D thanks again!
Seerumi
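On the trailing quote mentioned in the comment above: the pattern ends with a literal ", so the full match includes it, while the capture group only holds the part after the base URL. One possible way around that, sketched here under the assumption that the URLs always sit inside double-quoted attributes, is to match everything up to (but not including) the next double quote. The sample markup and file names below are made up for illustration.

```php
<?php
// Hypothetical sample input; the file names are invented for this sketch.
$html_data = '<p><img src="http://images.examplesite.com/images/cat.jpg" alt="">'
           . '<a href="http://images.examplesite.com/images/dog.png">dog</a></p>';

// Using ~ as the delimiter avoids escaping every slash.
// [^"]+ stops right before the closing quote, so the quote is never part of the match.
// Assumption: attribute values are double-quoted.
preg_match_all('~http://images\.examplesite\.com/images/[^"]+~i', $html_data, $results);

// With the default PREG_PATTERN_ORDER, $results[0] is the list of full matches.
$urls = $results[0];
print_r($urls);
```

If the documents also use single-quoted or unquoted attributes, the character class would need to exclude those delimiters as well.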
A: 

You can either use an HTML DOM parser

or use a regular expression:

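  // Escape the dots so they match a literal "." rather than any character.
  preg_match_all("/http:\/\/images\.examplesite\.com\/images\/(.*?)\"/s", $str, $preg);
  // $preg[0] holds the full matches (including the closing quote); $preg[1] holds only the part after the base URL.
  print_r($preg);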
marvin
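For the DOM parser route, here is a minimal sketch using PHP's built-in DOMDocument. It assumes the URLs of interest appear in the src attribute of img elements, which the question does not confirm; the sample markup and the $base variable are only illustrative.

```php
<?php
// Hypothetical sample input; only the first image uses the base URL from the question.
$html = '<html><body>'
      . '<img src="http://images.examplesite.com/images/cat.jpg" alt="">'
      . '<img src="http://other.example.org/logo.png" alt="">'
      . '</body></html>';
$base = 'http://images.examplesite.com/images/';

// Collect parser warnings internally instead of printing them,
// since real-world HTML is often not well-formed.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

$urls = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    $src = $img->getAttribute('src');
    // Keep only the sources that start with the known base URL.
    if (strpos($src, $base) === 0) {
        $urls[] = $src;
    }
}
print_r($urls);
```

This avoids depending on how the attribute is quoted, at the cost of only looking at the tags and attributes you explicitly ask for.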