What I want is a function I can run on user input that will intelligently find and add the width and height attributes to any <img> tag in a blob of HTML so as to avoid page-reflowing issues while images load.

I am writing the posting script for a PHP forum, where a user's input is sanitised and generally made nicer before writing it to the database for later display. As an example of what I do to make things nicer, I have a script that inserts alt attributes into images like so:

Here are two images: <img src="http://example.com/image.png"> <img src="http://example.com/image2.png">

which, upon sanitising by the posting script, becomes

Here are two images: <img src="http://example.com/image.png" alt="Posted image"> <img src="http://example.com/image2.png" alt="Posted image">

(This makes it validate under HTML 4 strict, but maybe isn't in the spirit of the alt attribute—alas!)

So, for my function, I have a vague idea that the server will need to run getimagesize() on each external image it finds in the block of HTML, then apply the attributes that function generates to each and every <img> tag it runs into. I assume that this function has been written before, but I have had no luck on Google or php.net docs. Do I have to start from scratch, or is somebody aware of a (relatively) robust function that I can use or adapt to do this job?

+3  A: 

You're right about getimagesize(). You can simply do something like this:

$img = 'image2.png';
$info = getimagesize($img);
printf('<img src="%s" %s>', $img, $info[3]);

If the image is hosted at a remote location, getimagesize() has to download it before it can measure it. The function takes care of the download itself (provided allow_url_fopen is enabled), but it is slow, so you might want to cache the result to speed things up on subsequent requests.
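A minimal sketch of that caching idea, assuming PHP 7.1 or later. The function name cached_image_size(), the array used as the cache, and the injectable $probe parameter are all hypothetical; in a real forum the cache would more likely be a database table. Failed lookups are deliberately not cached here, so a temporarily unreachable image is retried next time.

```php
<?php
/**
 * Look up an image's dimensions, consulting a cache before probing the image.
 *
 * $cache is a plain array in this sketch (an assumption; a database table
 * keyed by URL would play the same role). $probe defaults to getimagesize()
 * and is only injectable so the function can be tried without network access.
 *
 * Returns [width, height], or null if the image could not be read.
 */
function cached_image_size(string $src, array &$cache, ?callable $probe = null): ?array
{
    // Cache hit: no download needed.
    if (isset($cache[$src])) {
        return $cache[$src];
    }

    $probe = $probe ?? 'getimagesize';

    // getimagesize() returns false (and emits a warning) on failure.
    $info = @$probe($src);
    if (!is_array($info)) {
        return null; // not cached, so the lookup is retried next time
    }

    // Indexes 0 and 1 hold the width and height in pixels.
    $cache[$src] = [(int) $info[0], (int) $info[1]];
    return $cache[$src];
}
```

Called twice with the same URL and the same $cache array, the second call returns the stored pair without touching the network.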

Edit: Just saw that you have a string containing various <img> elements. This should do the trick:

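<?php
// Sample input: an HTML fragment containing two remote images.
$html = <<<EOF
something <img src="https://www.google.com/images/logos/ssl_logo_lg.gif"> hello <img src="https://mail.google.com/mail/images/2/5/logo1.png">
EOF;

// Parse the fragment. For this input the parser wraps it in
// <html><body><p>...</p></body></html>.
$dom = new DOMDocument();
$dom->loadHTML($html);

// Measure each image and write its dimensions back onto the element.
foreach ($dom->getElementsByTagName('img') as $img) {
    list($width, $height) = getimagesize($img->getAttribute('src'));
    $img->setAttribute('width', $width);
    $img->setAttribute('height', $height);
}

// Copy the children of the generated <p> into a fresh document,
// so that the output contains only the original fragment.
$xpath = new DOMXPath($dom);
$newDom = new DOMDocument();
foreach ($xpath->query('//body/p')->item(0)->childNodes as $node) {
    $newDom->appendChild($newDom->importNode($node, true));
}

$newHtml = $newDom->saveHTML();
?>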
Daniel Egeberg
Actually, caching the results is a good idea regardless of where the images are stored.
Gordon
I'm very grateful for you walking through the DOM process, as it was what scared me. I will try using this code now.
I'm slowly implementing this, and am currently puzzling through an if() check that ensures that I don't end up with `width="" height=""` if PHP fails to retrieve the image. I've also noticed that if my HTML string is simply `<img ...> <img ...> <img ...> `, nothing is returned in $newHtml. I will work around that in my awful way by inserting a dummy word at the front and end of the string and stripping it out after processing. This is wonderful, though. Thank you.
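Both issues raised in this comment can be handled without dummy words. The following is only a sketch under stated assumptions, not the answerer's code: it assumes PHP 7.1 or later with the DOM extension, and the function name add_image_dimensions() and the injectable $probe parameter are hypothetical. It wraps the fragment in a <div> so the content can be found again whether or not the parser generates a <p>, and it skips any image that getimagesize() fails to read. Character encoding is not addressed.

```php
<?php
/**
 * Add width/height attributes to every <img> in an HTML fragment.
 *
 * $probe is a hypothetical injection point for testing; it defaults
 * to getimagesize(), which downloads remote images to measure them.
 */
function add_image_dimensions(string $html, ?callable $probe = null): string
{
    $probe = $probe ?? 'getimagesize';

    // Collect parser warnings about sloppy user markup instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // Wrap the fragment in a <div> so it has one known container,
    // even when the input consists of nothing but <img> tags.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        // getimagesize() returns false (and emits a warning) on failure.
        $info = @$probe($img->getAttribute('src'));
        if (!is_array($info)) {
            continue; // leave the tag alone rather than write width="" height=""
        }
        $img->setAttribute('width', (string) $info[0]);
        $img->setAttribute('height', (string) $info[1]);
    }

    // The wrapper is the first <div> in document order.
    $root = $dom->getElementsByTagName('div')->item(0);
    if ($root === null) {
        return $html; // parsing went wrong; fall back to the input
    }

    // Serialise only the wrapper's children, dropping the <html>, <body>
    // and <div> elements that were added around the fragment.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}
```

Calling add_image_dimensions($post) on the text of a post returns the same fragment with dimensions added wherever the image could be measured. Note that the output is re-serialised by the parser, so unusual markup may come back normalised.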
A: 

The problem is that you are requiring the server to do an awful lot of work up front. I suspect it would be a lot more efficient to populate a database of sizes offline (or maintain a cache of sizes).

And having done this, you could push the work out to the browser by using a cacheable JavaScript file which sets the image sizes and is called inline at the end of the HTML (which has the advantage that you don't need to push all the HTML through your PHP code for rewriting). Hint: iterate through document.images[]

HTH

C.

symcbean
This is not a bad idea as an alternative. It would not be the end of the world if somebody circumvented this javascript, for instance, so it's not critical to have PHP do it.
A: 

You can use getimagesize(), but it would be smart to store this information once and reuse it, or at least cache it aggressively (it's unlikely to change often). Otherwise your server will grind to a halt under higher load.

AlliXSenoS
Is caching really an issue, when PHP only has to do the heavy lifting of downloading a remote image and examining it when a user posts an image to the site? It will not have to be called on every page load, because it changes the HTML written to the database, which is then retrieved and inserted verbatim by the thread display script. Images may only be posted a handful of times a day.