So, I'm trying to create a function which does the following:

  • Receive HTML in a string
  • Find file paths (in src attributes)
  • Only replace image URLs starting with the domain example.com
  • Only replace images (jpg, jpeg, gif, png)
  • Replace the domain of the image with example2.com
  • Return HTML with image URLs replaced.

Is there an elegant way of doing that? I've been struggling to create a regular expression to take care of it for me, and have only met with epic fail so far.

Any help greatly appreciated.

+1  A: 

The PHP Simple HTML DOM Parser should do the trick for that.

Sarfraz
I have a feeling ircmaxell's answer was more comprehensive, but the Simple HTML DOM Parser was documented in a way that was easier for me to use. Thank you!
palmaceous
+2  A: 

First off, don't use regex to parse HTML...

Here's a basic example using XPath and DOMDocument:

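// Parse the HTML string into a DOM tree
$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Select every <img> whose src mentions example.com
$query = '//img[contains(@src, "example.com")]';
$imgs = $xpath->query($query);

foreach ($imgs as $img) {
    $src = $img->getAttribute('src');
    // Take the extension from the URL path so a query string is ignored
    $path = (string) parse_url($src, PHP_URL_PATH);
    $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
    if (in_array($extension, array('jpg', 'jpeg', 'gif', 'png'))) {
        $src = str_replace('example.com', 'example2.com', $src);
        $img->setAttribute('src', $src);
    }
}

// Serialize the modified document back to a string
$html = $dom->saveHTML();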
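
If you want this as a reusable function with a stricter domain check, here is a minimal sketch. It assumes that "starting with the domain example.com" means the URL's host is exactly example.com (so www.example.com would not match). The function name replaceImageDomain and its parameters are made up for illustration, and it swaps the XPath query for a plain loop over the img elements plus parse_url():

```php
<?php
// Hypothetical helper, not part of the original answer: rewrites the host of
// <img src="..."> URLs from $fromHost to $toHost for jpg/jpeg/gif/png images.
function replaceImageDomain(string $html, string $fromHost = 'example.com', string $toHost = 'example2.com'): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally so imperfect markup does not emit notices.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $allowed = array('jpg', 'jpeg', 'gif', 'png');

    foreach ($dom->getElementsByTagName('img') as $img) {
        $src  = $img->getAttribute('src');
        $host = parse_url($src, PHP_URL_HOST);
        $path = parse_url($src, PHP_URL_PATH);

        // Skip relative URLs and anything parse_url() could not split up.
        if (!is_string($host) || !is_string($path)) {
            continue;
        }

        // Assumption: the host must equal $fromHost exactly (case-insensitive).
        $extension = strtolower(pathinfo($path, PATHINFO_EXTENSION));
        if (strcasecmp($host, $fromHost) === 0 && in_array($extension, $allowed, true)) {
            // Replace only the first occurrence of the host, leaving path and query untouched.
            $pos = strpos($src, $host);
            $img->setAttribute('src', substr_replace($src, $toHost, $pos, strlen($host)));
        }
    }

    return $dom->saveHTML();
}
```

Call it as $html = replaceImageDomain($html);. Note that loadHTML() wraps a fragment in html and body elements, so the returned string is a full document rather than the bare fragment you passed in.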
ircmaxell
That... Just opened a whole new world. Thank you, I'll give it a shot now!
palmaceous