I've been looking for a straight answer to this (I can think of lots of possibilities, but I'd like to know the true reason):

jQuery provides a .data() method for associating data with DOM Element objects. What makes this necessary? Is there a problem adding properties (or methods) directly to DOM Element Objects? What is it?

+1  A: 

I think you can add all the properties you want, as long as only you use them and the property is not a method or an object containing methods. The problem with methods is that they can create memory leaks in browsers. Especially when you use closures in such methods, the browser may not be able to complete garbage collection, which leaves scattered pieces of memory occupied.

This link explains it nicely.

Here you'll find a description of several common memory leak patterns.

KooiInc
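The closure pattern that the answer above warns about can be sketched roughly as follows. This is a minimal illustration, not code from the answer: `attach` and `detach` are hypothetical helper names, and a plain object stands in for a DOM element so the snippet runs outside a browser. According to the discussion further down, the leak itself mainly affected older versions of IE.

```javascript
// Attach a handler that closes over the element it is stored on.
// element -> onclick -> closure scope -> element forms a cycle.
function attach(element) {
  element.onclick = function () {
    return element.id;
  };
}

// Breaking the cycle by hand before discarding the element.
function detach(element) {
  element.onclick = null;
}

// A plain object stands in for a DOM element (assumption for this sketch).
var box = { id: "box" };
attach(box);
var idSeenByHandler = box.onclick();
detach(box);
```

Clearing the property before the node is thrown away removes the reference from the element to the function, which is the usual manual workaround for this kind of cycle.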
+2  A: 

It has to do with the fact that the DOM in IE is not managed by JScript, which makes it a completely different environment to access. This leads to memory leaks (see http://www.crockford.com/javascript/memory/leak.html). Another reason is that when people use innerHTML to copy nodes, the added properties are not transferred.

nemisj
+8  A: 

Is there a problem adding properties (or methods) directly to DOM Element Objects?

Potentially.

There is no web standard that says you can add arbitrary properties to DOM nodes. They are ‘host objects’ with browser-specific implementations, not ‘native JavaScript objects’ which according to ECMA-262 you can do what you like with. Other host objects will not allow you to add arbitrary properties.

In reality since the earliest browsers did allow you to do it, it's a de facto standard that you can anyway... unless you deliberately tell IE to disallow it by setting document.expando= false. You probably wouldn't do that yourself, but if you're writing a script to be deployed elsewhere it might concern you.

There is a practical problem with arbitrary-properties in that you don't really know that the arbitrary name you have chosen doesn't have an existing meaning in some browser you haven't tested yet, or in a future version of a browser or standard that doesn't exist yet. Add a property element.sausage= true, and you can't be sure that no browser anywhere in space and time will use that as a signal to engage the exciting DOM Sausage Make The Browser Crash feature. So if you do add an arbitrary property, make sure to give it an unlikely name, for example element._mylibraryname_sausage= true. This also helps prevent namespace conflicts with other script components that might add arbitrary properties.

There is a further problem in IE in that properties you add are incorrectly treated as attributes. If you serialise the element with innerHTML you'll get an unexpected attribute in the output, eg. <p _mylibraryname_sausage="true">. Should you then assign that HTML string to another element, you'll get a property in the new element, potentially confusing your script.

(Note this only happens for properties whose values are simple types; Objects, Arrays and Functions do not show up in serialised HTML. I wish jQuery knew about this, because the way it works around it to implement the data method is absolutely terrible, results in bugs, and slows down many simple DOM operations.)

bobince
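The advice in the answer above about giving arbitrary properties an unlikely, prefixed name can be sketched like this. The helper names `setExpando` and `getExpando` are hypothetical, the prefix is the one used as an example in the answer, and a plain object stands in for a DOM element so the snippet runs anywhere.

```javascript
// Prefix taken from the answer's example; choose one unique to your own library.
var PREFIX = "_mylibraryname_";

// Store a value under a prefixed property name on a node-like object.
function setExpando(node, name, value) {
  node[PREFIX + name] = value;
}

// Read the value back; returns undefined if nothing was stored.
function getExpando(node, name) {
  return node[PREFIX + name];
}

// A plain object stands in for a DOM element (assumption for this sketch).
var element = { nodeName: "P" };
setExpando(element, "sausage", true);
```

Routing every read and write through two small functions keeps the prefix in one place, so the unprefixed name `element.sausage` is never touched.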
+1, although I wish you had mentioned HTML5 `data-*` storage (http://dev.w3.org/html5/spec/Overview.html#attr-data). With regard to jQuery, `data` storage is implemented as an expando attribute storing 1 numeric value that maps to a key in `jQuery.cache`. The theory is that `jQuery.cache` can be freely reclaimed by the browser on unload because it is unattached to any DOM element, and conversely the DOM element can be freely garbage collected because it does not store any self-referencing object/function/etc on itself. I don't see the problem there?
Crescent Fresh
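The scheme described in the comment above (one numeric expando on the node, with the real data held in a separate cache) might look roughly like the sketch below. It is not jQuery's source: `EXPANDO`, `cache`, `data` and `removeData` are hypothetical names chosen for illustration, and a plain object stands in for a DOM element.

```javascript
var EXPANDO = "_mylibraryname_id"; // hypothetical property name
var cache = {};                    // id -> data object, not attached to any node
var nextId = 1;

// With a value: store it. Without a value: read it back.
function data(node, key, value) {
  var id = node[EXPANDO];
  if (value === undefined) {
    // No id means nothing was ever stored for this node.
    return id === undefined ? undefined : cache[id][key];
  }
  if (id === undefined) {
    // First write: the node only ever receives a plain number.
    id = node[EXPANDO] = nextId++;
    cache[id] = {};
  }
  cache[id][key] = value;
  return value;
}

// Drop the cache entry and the id when the node is discarded.
function removeData(node) {
  var id = node[EXPANDO];
  if (id !== undefined) {
    delete cache[id];
    delete node[EXPANDO];
  }
}

// A plain object stands in for a DOM element (assumption for this sketch).
var el = { nodeName: "DIV" };
data(el, "handler", function () { return el; });
```

The function stored here refers back to `el`, but the node itself holds only a number, so no reference runs from the node to the function.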
The problem is (1) jQuery then has to remove the `jQuery(number)` attribute from the HTML any time you do an operation on the HTML (such as `html()` or `clone()`). It does this by (uuuurrrrrgghhh) sticking the string through a regex. So if you clone a paragraph that contains the text `jQuery0="foo"`, the text will disappear. (2) Because it now has to keep track of IDs, DOM operations that should be quick and easy aren't. For example, try to `remove()` an element and watch as jQuery recurses into every descendant node looking for data ids to throw away. Oh dear.
bobince
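Point (1) in the comment above can be illustrated with a plain string operation. The pattern below is a hypothetical stand-in written for this sketch, not jQuery's actual regex; it only shows why removing id attributes by pattern-matching serialised HTML cannot tell an attribute from text content that happens to look like one.

```javascript
// Hypothetical pattern (an assumption, not jQuery's source):
// matches a space followed by jQuery<digits>="...".
var STRIP = / jQuery\d+="[^"]*"/g;

// Serialised HTML with a real id attribute and look-alike text content.
var html = '<p jQuery12="7">To break it, type jQuery0="foo" here.</p>';

// The regex sees only characters, so both occurrences are removed.
var cleaned = html.replace(STRIP, "");
// cleaned is '<p>To break it, type here.</p>'
```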
+1. Great answer as usual.
Tim Down
@bobince: re (1): yuck indeed. re (2): yep, jQuery has to delete associated data it stores when removing a node (and do the same for all descendant nodes). However the same would have to be done if an implementation stored the data directly off the node as an object (eg `delete node.customDataObject` + descend into all descendants). That is, if an implementation wanted to do the right thing with respect to memory.
Crescent Fresh
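The descendant walk discussed in the two comments above could be sketched as follows. This is an illustration under assumptions, not jQuery's implementation: `KEY`, `cache` and `cleanData` are hypothetical names, and nested plain objects with a `childNodes` array stand in for a DOM subtree.

```javascript
var KEY = "_mylibraryname_id";             // hypothetical id property
var cache = { 1: { a: 1 }, 2: { b: 2 } };  // id -> stored data

// Remove cached data for a node and every descendant.
// Returns how many nodes had to be visited.
function cleanData(node) {
  var visited = 1;
  var id = node[KEY];
  if (id !== undefined) {
    delete cache[id];
    delete node[KEY];
  }
  var kids = node.childNodes || [];
  for (var i = 0; i < kids.length; i++) {
    visited += cleanData(kids[i]);
  }
  return visited;
}

// Four nodes in total; only the root and one leaf carry data.
var leaf = { childNodes: [] };
leaf[KEY] = 2;
var tree = { childNodes: [{ childNodes: [leaf] }, { childNodes: [] }] };
tree[KEY] = 1;

var visitedCount = cleanData(tree);
```

Every node is visited even though only two of them hold ids, which is the cost being debated: the walk is proportional to the size of the subtree, not to the amount of stored data.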
True, however most JS objects don't leak memory if you leave them attached to DOM nodes. The main case where memory leaks is when the object is a function closed over a circular reference back to the node, in IE (up to IE7 and with worse long-term effect in IE6). jQuery, in its attempt to smooth over the browser bugs, ends up inflicting the lowest-common-denominator performance on everyone. (And adding its own smaller memory leak in the process, judging by question 1462649.)
bobince
Ideally, the ‘cache’ workaround should only be used for That Bad Browser, and even then storing a property as a structured object (eg. `node._jquery_id= [1]`) so that it never appeared in the `innerHTML` would be better than the current ghastly business with the regex.
bobince
@bobince: Your `_jquery_id = [num]` example reminded me that the prototype lib uses the same technique. Some very nice points, thank you.
Crescent Fresh