I have to process loads of images. First, I need to check whether the size of each image is greater than 50x60 and increase a counter of bad images accordingly.

The problem I have is that the speed of n.width / n.height in Internet Explorer 8 is extremely low. I checked n.offsetWidth and n.clientWidth, but they are all the same speed-wise. I cannot use n.style.width, though, because this value is not always set on the <img /> tags I'm interested in.

Consider the following code:

JavaScript

var Test = {
    processImages: function () {
        var fS = new Date().getTime();

        var minimagew = 50,
            minimageh = 60;
        var imgs = document.getElementsByTagName('img');
        var len = imgs.length,
            isBad = 0,
            i = len;

        while (i--) {
            var n = imgs[i];

            var imgW = n.width;
            var imgH = n.height;

            if (imgW < minimagew || imgH < minimageh) {
                isBad++;
            }
        }

        var fE = new Date().getTime();
        var fD = (fE - fS);

        console.info('Processed ' + imgs.length + ' images in ' 
                     + fD + 'ms.  ' + isBad + ' were marked as bad.');
    }
};

HTML

<img src="http://nsidc.org/images/logo_nasa_42x35.gif" />
   [snip 9998 images]
<img src="http://nsidc.org/images/logo_nasa_42x35.gif" />

The code produces the following output when parsing 10k images (3 separate Ctrl+F5 runs):

  • FF: Processed 10000 images in 115ms. 10000 were marked as bad.
  • FF: Processed 10000 images in 99ms. 10000 were marked as bad.
  • FF: Processed 10000 images in 87ms. 10000 were marked as bad.
  • IE8: Processed 10000 images in 206ms. 10000 were marked as bad.
  • IE8: Processed 10000 images in 204ms. 10000 were marked as bad.
  • IE8: Processed 10000 images in 208ms. 10000 were marked as bad.

As you can see, the code in FF 3.6 is twice as fast as the code executed in IE8.

To prove that the issue really is the speed of the browser's dimension properties, I changed n.width and n.height to constants, so we have:

 var imgW = 43;
 var imgH = 29;

I get the following results:

  • FF: Processed 10000 images in 38ms. 10000 were marked as bad.
  • FF: Processed 10000 images in 34ms. 10000 were marked as bad.
  • FF: Processed 10000 images in 33ms. 10000 were marked as bad.
  • IE8: Processed 10000 images in 18ms. 10000 were marked as bad.
  • IE8: Processed 10000 images in 22ms. 10000 were marked as bad.
  • IE8: Processed 10000 images in 17ms. 10000 were marked as bad.

That's right! When we skip the <img /> dimension check (the calls to node.width / node.clientWidth etc.), IE8 actually performs better than Firefox.

Do you have any idea why it takes IE so long to check the size of an image, and possibly how to improve the performance of this check?

+7  A: 

Well, your code is pretty basic. The only thing you can optimize is how you check the dimensions:

if (n.width < minimagew || n.height < minimageh) {
  isBad++;
}

This way, if the width of an image is already bad, the height is never accessed, thanks to short-circuit evaluation. It will make your code 1.5-2x faster for images with a bad width.

But my guess is that you don't actually need 10 000 images as part of your website. In that case, you can do your check on detached Image objects instead of <img> elements:

for (var i = 0; i < 10000; i++) {
    var img = new Image(); // detached object, never inserted into the DOM
    img.src = "http://nsidc.org/images/logo_nasa_42x35.gif";
    // note: width/height only become reliable once the image has loaded,
    // so the dimension check itself belongs in an onload handler
}

This will make your code 2x faster in IE 8 and 10x faster in FF.

Making these changes gave the following improvements on my computer (demo):

FF: 200 ms ->  7 ms
IE:  80 ms -> 20 ms
galambalazs
Hi, thanks for your answer. Like I said, I do need to parse loads of images: genuine illustrations, icons, emoticons, invisible pixels, basically everything on the page that is an `<img />`. So I must check the size to determine whether or not the image is big enough to be considered an 'illustration'. The URL of the NASA logo I gave is just an example; normally all the images would be different. Unfortunately, your code is not what I'm looking for.
rochal
There is only one thing you can do to speed up a primitive operation like this: http://code.google.com/chrome/chromeframe/
galambalazs
+1  A: 

Hi, I was very interested in your question, but unfortunately I didn't really get anywhere optimizing your code. I was able to trim about 30 to 40 ms off the IE execution (this is obviously dependent on the power of your physical machine), but I tried just about everything.

Things I tried instead of [element].width:

  • [element].getBoundingClientRect() - basically this returns height and width in one call (see the sketch right after this list)
  • document.elementFromPoint(x, y) - I thought that by using offsetLeft + 50 and offsetTop + 60 I could determine whether the element at that point was different from the current element, meaning that it was a "bad" image.
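
Here is a minimal sketch of the getBoundingClientRect() variant, reusing the imgs/i/minimagew/minimageh/isBad variables from the loop in the original question (IE8's rects don't expose width/height properties, hence the subtractions):

while (i--) {
    // a single layout query returns all four edges at once
    var rect = imgs[i].getBoundingClientRect();
    if ((rect.right - rect.left) < minimagew ||
        (rect.bottom - rect.top) < minimageh) {
        isBad++;
    }
}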

In the end this is what I came up with to trim just a bit off the time (30 to 40 ms).

Note: the best time I got in IE 8 was 171ms

Edited - I modified the code below to include your way, my way, and a jQuery way. Test it out.

<html>
<head>
<script src="http://code.jquery.com/jquery-1.4.2.js" type="text/javascript"></script>
<script type="text/javascript">
var TestYourWay = {
    processImages: function () {
        var fS = new Date().getTime();

        var minimagew = 50,
            minimageh = 60;
        var imgs = document.getElementsByTagName('img');
        var len = imgs.length,
            isBad = 0,
            i = len;

        while (i--) {
            var n = imgs[i];

            var imgW = n.width;
            var imgH = n.height;

            if (imgW < minimagew || imgH < minimageh) {
                isBad++;
            }
        }

        var fE = new Date().getTime();
        var fD = (fE - fS);

        alert('Processed ' + imgs.length + ' images in '
                     + fD + 'ms.  ' + isBad + ' were marked as bad.');
    }
};


var TestMyWay = {
    processImages: function () {
        var fS = new Date(),
        imgs = document.getElementsByTagName('img'),
        isBad = 0;
        for (var i = 0, img; img = imgs[i]; i++) {
            if (img.width  < 50 || img.height < 60) {
                isBad++;
            }
        }
        var fD = new Date() - fS;
        alert('Processed ' + i + ' images in ' + fD + 'ms.  ' + isBad + ' were marked as bad.');
    }
};

var TestJquery = {
    processImages: function () {
        var fS = new Date(),
        imgs = $('img'),
        isBad = 0;
        imgs.each(function () {

           if (this.width  < 50 || this.height < 60) {
                isBad++;
            }
        });
        var fD = new Date() - fS;
        alert('Processed ' + imgs.length + ' images in ' + fD + 'ms.  ' + isBad + ' were marked as bad.');
    }
};

</script>
</head>
<body>
     <button onclick="TestYourWay.processImages();" id="yourWay">Your Way</button>
     <button onclick="TestMyWay.processImages();" id="myWay">My Way</button>
     <button onclick="TestJquery.processImages();" id="jqueryWay">jQuery Way</button>
     <img src="http://nsidc.org/images/logo_nasa_42x35.gif" />
     <!--Copy This image tag 10000 times -->
</body>
</html>

Other things of note:

The JavaScript engine in IE8 is not as fast as Firefox 3.6+, Safari, or Chrome. Opera has made improvements to its scripting engine, but it is still not as fast as FF, Safari, or Chrome; however, Opera does outperform IE8 in some things while being sluggish in others. Also of note: IE9 is due out late this year or early next year, and they have made improvements to its JavaScript engine. You can see some statistics on what I am saying here:

https://spreadsheets.google.com/pub?key=0AuWerG7Xqt-8dHBuU2pGMncwTENNNGlvNzFtaE5uX0E&hl=en&output=html

John Hartsock
@rochal ... I modified my answer with code for an HTML page that will run your way from your original question, a way I devised using pure JavaScript, and also a new way using jQuery. Test it out.
John Hartsock
I think this particular issue isn't so much about the JS engine as it is about the DOM rendering engine (especially since the OP points out that without the DOM hits, the JS runs *faster* in IE). DOM operations, even just reading positions/sizes, are notoriously slow in all browsers. If the functions you suggested do make it run faster, it may indeed be batching DOM requests rather than not doing so, which is an improvement.
Ken
Opera is faster than FF3.6, and AFAIK still faster than FF4. It's about on par with Safari.
alpha123
+1  A: 

Well, this is most likely not what you're looking for, but I thought I'd post it in case it helps someone else. Since there's no way to improve the speed of the browser's basic functionality, you can at least prevent the loop from freezing the browser while it is executing. You can do this by performing your loop in chunks and initiating the next chunk using setTimeout with a time of 0. This basically allows the browser to repaint and perform other actions before it runs the next chunk. Here's a modified version of your script:

var Test = {
    processImages: function() {
        var fS = new Date().getTime();

        var minimagew = 50,
            minimageh = 60,
            stepSize = 1000;
        var imgs = document.getElementsByTagName('img');
        var len = imgs.length,
            isBad = 0,
            i = len,
            stopAt = len;

        var doStep = function() {
            stopAt -= stepSize;
            while (i >= stopAt && i--) {
                var n = imgs[i];

                var imgW = n.width;
                var imgH = n.height;

                if (imgW < minimagew || imgH < minimageh) {
                    isBad++;
                }
            }

            if (i > 0)
                setTimeout(doStep, 0);
            else {
                var fE = new Date().getTime();
                var fD = (fE - fS);

                console.info('Processed ' + imgs.length + ' images in '
                     + fD + 'ms.  ' + isBad + ' were marked as bad.');
            }
        };
        doStep();
    }
};

Of course this makes the total execution time longer, but maybe you can use it so that your page stays usable while it is working.
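
A trivial way to kick this off once the images are in the DOM, assuming the Test object above (just a usage sketch):

window.onload = function () {
    Test.processImages();
};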

InvisibleBacon
Thanks for your answer. The reason I accepted your answer over other higher-voted answers is that your solution actually handles the problem gracefully. It was interesting to see other ideas, but instead of philosophical answers, what I really needed was code that would lead me to a nice, actual solution. I think using timeouts and introducing 'buffering' to do only part of the job at once won't necessarily speed up the entire process, but it will definitely improve the experience of the user. Unfortunately, IE sucks, and if we can't improve basic operations, we must find a way around them.
rochal
@rochal: Better yet: Use the right tool for the job. If you had told us more details about what you're trying to do (I asked in a comment to your question) we would have given you different answers.
some
@some, agreed. On the other hand, it is one skill to provide the code, and another skill to "guess what the client wants" and give him something better. I'm not disapproving of your posts (I found them very interesting, in fact); I'm marking this as the answer because it will deal with my problem. I should probably edit my question to explain why this answer suits me best.
rochal
@rochal: It's your question, you choose the answer. You made your point in the first comment to this answer about why it was your choice. About guessing what the client wants and giving him something better: I've asked you three times now for more details of what you are trying to do, because I think there are much better tools out there, but without a little more information it's hard to give you an answer. To start with: what pages have 10 000 images on a single page? And how can 200ms of processing be a problem when transferring the files from the server to the client takes seconds?
some
A: 

I haven't tested this on IE, but what you could do is remove the variable declarations from the loop. Even though they're not that expensive (CPU-wise), they can spare a few CPU cycles when removed from a loop. You can also skip the imgW and imgH assignments and access the object's properties directly, which can save an object dereference; and thanks to short-circuit evaluation in the if statement, the height check won't even be executed for images with a faulty width:

var Test = {
    processImages: function () {
        var fS = new Date().getTime();

        var minimagew = 50,
            minimageh = 60;
        var imgs = document.getElementsByTagName('img');
        var len = imgs.length,
            isBad = 0,
            i = len;

        /* this is the only part updated */
        while (i--) {
            if (imgs[i].width < minimagew || imgs[i].height < minimageh) {
                isBad++;
            }
        }
        /* ... 'till here */

        var fE = new Date().getTime();
        var fD = (fE - fS);

        console.info('Updated:  Processed ' + imgs.length + ' images in ' 
                     + fD + 'ms.  ' + isBad + ' were marked as bad.');
    }
    ,processImagesOriginal: function () { // for comparison
        var fS = new Date().getTime();

        var minimagew = 50,
            minimageh = 60;
        var imgs = document.getElementsByTagName('img');
        var len = imgs.length,
            isBad = 0,
            i = len;

        while (i--) {
            var n = imgs[i];

            var imgW = n.width;
            var imgH = n.height;

            if (imgW < minimagew || imgH < minimageh) {
                isBad++;
            }
        }

        var fE = new Date().getTime();
        var fD = (fE - fS);

        console.info('Original: Processed ' + imgs.length + ' images in ' 
                     + fD + 'ms.  ' + isBad + ' were marked as bad.');
    }
};

//Original: Processed 10000 images in ~38ms. 10000 were marked as bad.
//Updated:  Processed 10000 images in ~23ms. 10000 were marked as bad.
ArtBIT
If memory serves, JavaScript doesn't support the notion of block-scoped variables. Essentially, if it's in the for loop, it's visible to the entire function, but the parser finds it when it first parses the function. It's one of the reasons in favor of declaring your variables closer to the point where they are first used in JavaScript. In short, moving the variable declaration out of the block would make exactly zero difference at all.
Mike Hofer
In theory yes, but this little test showed me that there is a slight difference: http://stackoverflow.com/questions/3684923/javascript-variables-declare-outside-or-inside-loop/3685188#3685188 . Also, having no block scope is just one more reason to declare the vars outside the loop and clear up any confusion about the variable's scope, IMHO.
ArtBIT
A: 

Accessing a layout attribute of an element on screen (or under document.body) may incur layout overhead (locking, reflow, etc.) in IE, even for read-only access.

I didn't test it, but you can try something like this:

    var done = i;
    while (i--) {
        var img = new Image();
        img.onload = function () {
            if (this.width < minimagew || this.height < minimageh) {
                isBad++;
            }
            if (--done === 0) { onComplete(); } // fires once every image has settled
        };
        img.onerror = function () {
            if (--done === 0) { onComplete(); }
        };
        img.src = imgs[i].src;
    }

Don't use a long loop! It makes the GUI slow.

EDIT: document.getElementsByTagName is slower than you'd expect. It does not return a static array but a live collection that reflects DOM changes (if any). You can try copying the elements into a plain array, which may reduce the performance drop caused by interference with other DOM APIs.
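
A minimal sketch of that copy step (assuming the same dimension check is then run over the plain array):

var live = document.getElementsByTagName('img');
var imgArray = [];
for (var j = 0, len = live.length; j < len; j++) {
    imgArray[j] = live[j]; // snapshot: later reads no longer touch the live collection
}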

Dennis Cheung
Just curious, how would `this` not refer to the Image object?
ArtBIT
@rochal, "this" is refer to the Image object, as most browser implementation. As far as I know, most browser(except opera) does not implement "this" right, and "this" in event handler will refer to a object, not the "this" where the function defined.
Dennis Cheung
@rochal, as I wrote in my answer, accessing "width" and "height" will trigger the browser to reflow the page to give you the most up-to-date value. If you are looking for the real rendered/computed style, you have to use getComputedStyle(), or currentStyle in IE.
Dennis Cheung
@DennisCheung, did he delete his original comment?
ArtBIT
+5  A: 

In other words, you're asking why browser A takes longer than browser B to do exactly the same thing.

Well, they DO NOT do the same thing. You write in your script what you want to happen, and the browser of your choice tries its best to make that happen, but you have little or no control over how it does it.

The browsers are written by different teams with different philosophies of how to do things. Your 20-something-line script could require 10000 CPU cycles in browser A and 50000 in browser B, depending on how the code of the browser is written and the inner workings of the browser.

To give a detailed answer as to why IE8 is slower than FF in this case, one would have to look under the hood at what's going on. Since the source code for IE8 isn't publicly available, we can't look there, and I'm not aware of any documentation detailed enough to tell what's going on.

Instead, I'll give you an example of how two different philosophies of how to do things can greatly affect the time needed to produce the same end result. Note: this is an example, and any resemblance to the real world is purely coincidental.

What to do:

  • get the dimensions of an image.

Team A:

  1. Load file from specified source
  2. decode image into memory
  3. return width and height of the image

Nothing wrong with that, is there? It does return the width and height of the image. Can team B do it better?

Team B:

  1. Load the first 1024 bytes of the file into memory
  2. Detect the image format
    • is it JPEG? find the FFC0 header, save width and height
    • is it PNG? locate the header, save width and height
    • is it GIF? locate the header, save width and height
  3. Return the width and height of the image

Their code also returns the width and height of the image, but it does so in a different way that is several orders of magnitude faster than the code team A wrote: only the beginning of the file is loaded into memory, and the dimensions are taken from the header without decoding the image. It saves bandwidth, memory, and time.
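
To make team B's idea concrete, here is a small illustrative sketch (my own, hypothetical, not tied to any browser's actual code) that reads the dimensions straight from a PNG header, assuming bytes holds at least the first 24 bytes of the file as a byte array:

function pngDimensions(bytes) {
    // PNG layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR",
    // then big-endian 4-byte width and 4-byte height
    var width  = (bytes[16] << 24) | (bytes[17] << 16) | (bytes[18] << 8) | bytes[19];
    var height = (bytes[20] << 24) | (bytes[21] << 16) | (bytes[22] << 8) | bytes[23];
    return { width: width, height: height };
}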

So, which one is the best, the code from team A or team B? Think about that for a moment while both teams run their code against 100 000 images... It might take a wh... oh, team B is already finished! They say that 10% of the images were smaller than 50x60 pixels, and that they couldn't open 3 of them. How about team A then? Looks like we have to wait a while... a cup of coffee, maybe?

[10 minutes later]

I guess you think team B wrote the better code, am I right or am I right?

Team A says 8% of the images were smaller than 50x60 pixels. Strange, that isn't what team B said. Team A also says that they couldn't get the dimensions of 20% of the images because those files were corrupted. That's something team B didn't say anything about...

So, which code do you think is the best now?

I apologize for linguistic errors, English isn't my native language.

some