views:

1113

answers:

4

While testing some code, I wrote a small function to compare the speed of the array.push method against direct addressing (array[n] = value). To my surprise, push often turned out to be faster, especially in Firefox and sometimes in Chrome. Just out of curiosity: does anyone have an explanation for this? You can find the test on this page (click 'Array methods comparison').

A: 

Push adds the value to the end, while array[n] has to go through the array to find the right spot. It probably depends on the browser and how it handles arrays.

Stiropor
In case of the test n is known (it's equivalent to [array].length-1), so there's no searching going on.
KooiInc
If you're looking for the n-th element, it needs to find a pointer to that spot in the array to fill in the value.
Stiropor
In the case of the test, n is known. However, the JavaScript libraries were written completely ignorant of your test and might still have [] search the array for the right spot even though you know perfectly well where it is. Think of a linked list with a tail pointer.
Chuck
+3  A: 

push() is a special case of the more general [[Put]] and therefore can be further optimized:

When calling [[Put]] on an array object, the argument has to be converted to an unsigned integer first because all property names - including array indices - are strings. Then it has to be compared to the length property of the array in order to determine whether or not the length has to be increased. When pushing, no such conversion or comparison has to take place: Just use the current length as array index and increase it.

Of course there are other things which will affect the runtime, e.g. calling push() should be slower than calling [[Put]] via [] because the prototype chain has to be checked for the former.


As olliej pointed out: actual ECMAScript implementations will optimize the conversion away, i.e. for numeric property names, no conversion from string to uint is done but just a simple type check. The basic assumption should still hold, though its impact will be less than I originally assumed.

Christoph
All JS engines actually optimise [[Put]] for integers on the assumption that if you're using an integer it's probably a type that has a special handler for Integer property names -- eg. Arrays, Strings, as well as DOM types (NodeLists, CanvasPixelArray, etc)
olliej
Err, completing the last comment - they assume Integer first, then generic object fallback will convert the Integer to a string and try again with the string representation.
olliej
+21  A: 

All sorts of factors come into play. Most JS implementations use a flat array that converts to sparse storage if it becomes necessary later on.

Basically the decision to become sparse is a heuristic based on what elements are being set, and how much space would be wasted in order to remain flat.

In your case you are setting the last element first, which means the JS engine will see an array that needs to have a length of n but only a single element. If n is large enough this will immediately make the array a sparse array -- in most engines this means that all subsequent insertions will take the slow sparse array case.

You should add an additional test in which you fill the array from index 0 to index n-1 -- it should be much, much faster.
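The three fill orders discussed here can be compared with something like the following. This is only a rough sketch: the helper names (`fillWithPush`, `fillAscending`, `fillDescending`, `timeIt`) and the element count are made up for illustration, they are not taken from the original test page, and the timings will differ a lot between engines and versions.

```javascript
// Hypothetical helpers, not the code from the original test page.
function fillWithPush(n) {
  const arr = [];
  for (let i = 0; i < n; i++) arr.push(i);
  return arr;
}

function fillAscending(n) {
  const arr = [];
  for (let i = 0; i < n; i++) arr[i] = i; // index 0 first, n-1 last
  return arr;
}

function fillDescending(n) {
  const arr = [];
  for (let i = n - 1; i >= 0; i--) arr[i] = i; // index n-1 first: may go sparse
  return arr;
}

// Crude wall-clock timing; good enough to see large differences only.
function timeIt(label, fill, n) {
  const start = Date.now();
  const result = fill(n);
  console.log(label + ": " + (Date.now() - start) + " ms");
  return result;
}

const N = 100000; // arbitrary size chosen for this sketch
timeIt("push", fillWithPush, N);
timeIt("ascending [0 .. n]", fillAscending, N);
timeIt("descending [n .. 0]", fillDescending, N);
```

All three functions produce the same array contents; only the order in which the indices are written differs.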

In response to @Christoph and out of a desire to procrastinate, here's a description of how arrays are (generally) implemented in JS -- specifics vary from JS engine to JS engine but the general principle is the same.

All JS Objects (so not strings, numbers, true, false, undefined, or null) inherit from a base object type -- the exact implementation varies, it could be C++ inheritance, or manually in C (there are benefits to doing it in either way) -- the base Object type defines the default property access methods, eg.

interface Object {
    put(propertyName, value)
    get(propertyName)
private:
    map properties; // a map (tree, hash table, whatever) from propertyName to value
}

This Object type handles all the standard property access logic, the prototype chain, etc. Then the Array implementation becomes

interface Array : Object {
    override put(propertyName, value)
    override get(propertyName)
private:
    map sparseStorage; // a map between integer indices and values
    value[] flatStorage; // basically a native array of values with a 1:1
                         // correspondence between JS index and storage index
    value length; // The `length` of the js array
}

Now when you create an Array in JS the engine creates something akin to the above data structure. When you insert an object into the Array instance, the Array's put method checks to see if the property name is an integer between 0 and 2^32-1 (or possibly 2^31-1, I forget exactly). If it is not, then the put method is forwarded to the base Object implementation, and the standard [[Put]] logic is done. Otherwise the value is placed into the Array's own storage. If the data is sufficiently compact, the engine will use the flat array storage, in which case insertion (and retrieval) is just a standard array indexing operation; otherwise the engine will convert the array to sparse storage, and put/get use a map to get from propertyName to value location.
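To make the dispatch described above more concrete, here is a toy model written in plain JavaScript. It is an illustration only and not how any particular engine does it: the class name `ToyArray`, the `MAX_GAP` threshold and the "gap size" heuristic are all assumptions invented for this sketch, and it only treats number keys as indices. The index check uses the ECMAScript definition of an array index (0 to 2^32 - 2).

```javascript
// Toy model only; real engines do this natively and with other heuristics.
class ToyArray {
  constructor() {
    this.properties = new Map(); // generic named properties (base Object part)
    this.flatStorage = [];       // dense storage, index maps 1:1 to slot
    this.sparseStorage = null;   // becomes a Map once the array goes sparse
    this.length = 0;
  }

  static isArrayIndex(key) {
    return Number.isInteger(key) && key >= 0 && key < 2 ** 32 - 1;
  }

  put(key, value) {
    if (!ToyArray.isArrayIndex(key)) {
      this.properties.set(String(key), value); // generic [[Put]] path
      return;
    }
    // Assumed heuristic: go sparse when the write leaves too large a gap.
    if (this.sparseStorage === null &&
        key - this.flatStorage.length > ToyArray.MAX_GAP) {
      this.sparseStorage = new Map();
      this.flatStorage.forEach((v, i) => this.sparseStorage.set(i, v));
      this.flatStorage = null; // this model never converts back to flat
    }
    if (this.sparseStorage !== null) {
      this.sparseStorage.set(key, value); // slow path: map lookup
    } else {
      this.flatStorage[key] = value;      // fast path: plain indexing
    }
    if (key >= this.length) this.length = key + 1;
  }

  get(key) {
    if (!ToyArray.isArrayIndex(key)) return this.properties.get(String(key));
    return this.sparseStorage !== null
      ? this.sparseStorage.get(key)
      : this.flatStorage[key];
  }

  isSparse() {
    return this.sparseStorage !== null;
  }
}
ToyArray.MAX_GAP = 1024; // arbitrary threshold for this sketch

const ascending = new ToyArray();
for (let i = 0; i < 5000; i++) ascending.put(i, i);
console.log(ascending.isSparse()); // false

const descending = new ToyArray();
for (let i = 4999; i >= 0; i--) descending.put(i, i);
console.log(descending.isSparse()); // true
```

In this model the descending fill goes sparse on its very first write and stays on the map-based path for every later insertion, which mirrors the slowdown described for the original test.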

I'm honestly not sure if any JS engine currently converts from sparse to flat storage after that conversion occurs.

Anyhoo, that's a fairly high level overview of what happens and leaves out a number of the more icky details, but that's the general implementation pattern. The specifics of how the additional storage is organised, and how put/get are dispatched, differ from engine to engine -- but this is the clearest I can really describe the design/implementation.

A minor additional point: while the ES spec refers to propertyName as a string, JS engines tend to specialise on integer lookups as well, so someObject[someInteger] will not convert the integer to a string if you're looking at an object that has integer properties, e.g. Array, String, and DOM types (NodeLists, etc).
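Whatever the engine does internally, the observable behaviour still follows the spec's string-keyed model. A small sketch of what that looks like from script (the variable name `arr` is just for illustration):

```javascript
const arr = ["a", "b", "c"];

// A number and its canonical string form name the same element.
console.log(arr[1] === arr["1"]); // true

// Writing through the string form still updates length.
arr["3"] = "d";
console.log(arr.length); // 4

// "01" is not a canonical array index, so it becomes an ordinary property.
arr["01"] = "x";
console.log(arr.length); // 4
console.log(Object.keys(arr)); // [ '0', '1', '2', '3', '01' ]
```

So the integer fast path is purely an optimisation: skipping the string conversion must not change which property is read or written.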

olliej
@olliej: "most JS implementations use a flat array that converts to sparse storage if it becomes necessary later on" - interesting. So array objects have two kinds of storage: one for regular properties, one for array entries?
Christoph
@Christoph: Yup -- I could go into great detail if you like, but it will be biased towards the JavaScriptCore/Nitro implementation -- the general model is the same in SpiderMonkey, V8 and KJS, but I don't know their *exact* implementation details
olliej
@olliej: just checked the SpiderMonkey source: the JSObject struct has a `dslot` member (d for dynamic) which will hold an actual array as long as the JS array is dense; I didn't check what will happen for sparse arrays or when using non-array-index property names
Christoph
@olliej: thanks, it's making sense. I added a [0..n] test to the page; it is faster and I understand why. Compared to push, [0..n] is faster in all browsers.
KooiInc
@olliej addendum: adding values directly and in an ascending [0..n] way actually chokes (my) IE7 here.
KooiInc
+1  A: 

These are the results I get with your test

on Safari:

  • Array.push(n) 1.000.000 values: 0.124 sec
  • Array[n .. 0] = value (descending) 1.000.000 values: 3.697 sec
  • Array[0 .. n] = value (ascending) 1.000.000 values: 0.073 sec

on FireFox:

  • Array.push(n) 1.000.000 values: 0.075 sec
  • Array[n .. 0] = value (descending) 1.000.000 values: 1.193 sec
  • Array[0 .. n] = value (ascending) 1.000.000 values: 0.055 sec

on IE7:

  • Array.push(n) 1.000.000 values: 2.828 sec
  • Array[n .. 0] = value (descending) 1.000.000 values: 1.141 sec
  • Array[0 .. n] = value (ascending) 1.000.000 values: 7.984 sec

According to your test, the push method seems to be better on IE7 (huge difference), and since the difference is small on the other browsers, the push method really seems to be the best way to add elements to an array.

But I created another simple test script to check which method is faster for appending values to an array, and the results really surprised me: using Array.length seems to be much faster than using Array.push. So I really don't know what to say or think anymore; I'm clueless.

BTW: on my IE7 your script stops and the browser asks me whether I want to let it continue (you know, the typical IE message that says: "Stop running this script? ..."). I would recommend reducing the number of loop iterations a little.
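The other test script is not shown here, so the following is only a guess at what the two append styles look like, assuming "using Array.length" means assigning to `arr[arr.length]`. The function names are made up for this sketch.

```javascript
// Append by calling the push method.
function appendWithPush(values) {
  const arr = [];
  for (const v of values) arr.push(v);
  return arr;
}

// Append by writing to the slot just past the current end.
function appendWithLength(values) {
  const arr = [];
  for (const v of values) arr[arr.length] = v;
  return arr;
}

const input = [10, 20, 30];
console.log(appendWithPush(input));   // [ 10, 20, 30 ]
console.log(appendWithLength(input)); // [ 10, 20, 30 ]
```

Both always write at the current end of the array, so neither should trigger the sparse case; any speed difference between them comes down to how a given engine handles the method call versus the indexed store.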

Marco Demajo