I've copied some text nodes from the page (i.e., live DOM) and have stored them in an array. Is it possible to locate those exact nodes again in the live DOM? I would need to use plain JavaScript.

This is basically how my project works:

  • A configuration object specifies sections of the page (using CSS selectors) where there is some text content, and optionally subsections within each section (also using CSS selectors) whose text content should be ignored. The property names that I use are `select` and `exclude`, respectively.
  • Using Sizzle (jQuery's selector engine), I generate an element collection for each of the `select` selectors, then clone it.
  • I then run the `exclude` selectors against the cloned `select` collection, find any elements that match, and remove them from that collection.
  • I traverse the `select` collection, with the `exclude` sections removed, to build an array of only the text nodes.
  • I use this array of text nodes to do some word matching based on a supplied list of terms that may be in the original text content. To do this, I build an array of objects that include properties like the text node in which a term was found, and at what position/offset that term occurs within the text node's data.
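For reference, the matching step in the last bullet could look roughly like the sketch below. The function name `findTermMatches` and the `node`/`term`/`offset` property names are made up for illustration, and it assumes plain case-sensitive substring matching, which may not be what the real project does.

```javascript
// Hypothetical helper: for every text node, record each occurrence of each
// term together with the node it was found in and the offset in node.data.
function findTermMatches(textNodes, terms) {
  var matches = [];
  for (var i = 0; i < textNodes.length; i++) {
    var data = textNodes[i].data;
    for (var j = 0; j < terms.length; j++) {
      var term = terms[j];
      if (!term) continue; // an empty term would never advance the search
      var offset = data.indexOf(term);
      while (offset !== -1) {
        matches.push({ node: textNodes[i], term: term, offset: offset });
        // Continue searching after the end of this occurrence.
        offset = data.indexOf(term, offset + term.length);
      }
    }
  }
  return matches;
}
```

Each entry keeps a direct reference to the text node it came from, so everything after this step depends on whether that reference points at a clone or at a live node.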

Given the latter array, I need to be able to match the cloned text nodes in which matching terms were found to the original text nodes from which they were cloned. If I can do this, I can just iterate over my array of objects, first finding the text node that corresponds to the live DOM (from which it was originally cloned), and then linking the term at the position/offset that I have recorded in the object.

Hopefully this makes some sense - please let me know if not, and I can provide more details. This must be deterministic - i.e., I can't just search the live DOM for a text node that has the same data, as that may lead to false positives.

Again, I must use plain JavaScript here, not jQuery or any other libraries.

I appreciate any help!

+2  A: 

In general, you can't. Once you've cloned, the new clones have a different identity to the old nodes. There is no general-purpose way to tie a node back to the node it was cloned from.

If you know the topmost cloned ancestor node, and the node in the document that it was cloned from, you can naturally use the index of each ancestor in its parent's `childNodes` list to walk down from the topmost ancestor to a particular node... but only assuming neither DOM has been mutated since cloning.
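A minimal sketch of that index-path idea, assuming you still hold both the clone root and the node it was cloned from. The helper names (`indexInParent`, `getIndexPath`, `followIndexPath`, `findSourceNode`) are invented for illustration. Note that removing the `exclude` nodes from the clone counts as a mutation, so the path would have to be taken before that happens.

```javascript
// Position of a node within its parent's childNodes (-1 if not found).
function indexInParent(node) {
  var siblings = node.parentNode.childNodes;
  for (var i = 0; i < siblings.length; i++) {
    if (siblings[i] === node) return i;
  }
  return -1;
}

// Child-index path leading from `root` down to `node`, or null if
// `node` is not inside `root`.
function getIndexPath(root, node) {
  var path = [];
  while (node !== root) {
    if (!node.parentNode) return null;
    path.unshift(indexInParent(node));
    node = node.parentNode;
  }
  return path;
}

// Follow a child-index path down from `root`; null if the trees differ.
function followIndexPath(root, path) {
  var node = root;
  for (var i = 0; i < path.length; i++) {
    node = node.childNodes[path[i]];
    if (!node) return null;
  }
  return node;
}

// Map a node in the clone back to the node at the same position in the source.
function findSourceNode(cloneRoot, sourceRoot, cloneNode) {
  var path = getIndexPath(cloneRoot, cloneNode);
  return path ? followIndexPath(sourceRoot, path) : null;
}
```

This only relies on `parentNode` and `childNodes`, so it needs no library, but it silently returns the wrong node if either tree has changed shape since the clone was made.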

Otherwise, you'd be left with some horrid hack like walking the entire DOM of the clone immediately after cloning, writing an expando property to each node to refer to the source node.
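For what it's worth, that hack might look something like the sketch below. The function name `linkCloneToSource` and the expando name `_sourceNode` are made up, and it assumes the target browsers accept expando properties on text nodes as well as elements. It has to run immediately after cloning, while both trees still have the same shape.

```javascript
// Walk the clone and its source in parallel, storing on each cloned node
// a reference to the node it was cloned from.
function linkCloneToSource(cloneNode, sourceNode) {
  cloneNode._sourceNode = sourceNode; // `_sourceNode` is an arbitrary expando name
  for (var i = 0; i < cloneNode.childNodes.length; i++) {
    linkCloneToSource(cloneNode.childNodes[i], sourceNode.childNodes[i]);
  }
}

// Intended use in a browser (not executed here):
//   var clone = section.cloneNode(true);
//   linkCloneToSource(clone, section);
//   // ...remove excluded elements from `clone`, collect its text nodes...
//   // then `someClonedTextNode._sourceNode` is the live text node.
```

Unlike an index path, the stored reference keeps working after nodes are removed from the clone, since it no longer depends on positions.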

I'm not sure why you are cloning nodes at all. In the process you describe, surely you could just store the live document nodes in your lists, without any cloning?

bobince
Thanks, bobince! You have a good point - the reason that I'm cloning the node trees is so that I can remove the `exclude` nodes without affecting the live page, and then keep a reference to the modified tree for easy traversal later on. I suppose I could just as well somehow mark those nodes as "invalid," and then check for that as I'm traversing the tree and accumulating text nodes. What would you suggest as a best practice for that - is there an ideal way that I can unobtrusively attach some data to the live nodes, so that a basic node traversal function can check for that?
Bungle
You can set non-standard properties, e.g. `element._my_thing = false`. This is called an expando property, and it's slightly frowned upon as no spec endorses it, but all browsers will let you do it. (There's also DOM 3 Core `setUserData`, but it's less well-supported.) It might be better just to keep the nodes in the `exclude` list, and check each node at traversal time to see whether it's in the list or not: `exclude.indexOf(node) !== -1`, except that not all browsers support `indexOf` yet. For a fallback see: https://developer.mozilla.org/En/Core_JavaScript_1.5_Reference/Objects/Array/IndexOf
bobince
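A rough sketch of the exclude-list traversal suggested in the comment above, working on the live nodes with no cloning. The names `isExcluded` and `collectLiveTextNodes` are hypothetical, `exclude` is assumed to be a plain array of the elements matched by the `exclude` selectors, and a manual loop stands in for `indexOf` so that no fallback is needed.

```javascript
var TEXT_NODE = 3; // same value as Node.TEXT_NODE in browsers

// True if `node` is one of the nodes in the `exclude` array.
function isExcluded(node, exclude) {
  for (var i = 0; i < exclude.length; i++) {
    if (exclude[i] === node) return true;
  }
  return false;
}

// Collect text nodes under `root` in document order, skipping every
// subtree whose root is in `exclude`. Nothing is cloned or modified.
function collectLiveTextNodes(root, exclude, out) {
  out = out || [];
  if (isExcluded(root, exclude)) return out; // skip the whole subtree
  if (root.nodeType === TEXT_NODE) {
    out.push(root);
  } else {
    for (var i = 0; i < root.childNodes.length; i++) {
      collectLiveTextNodes(root.childNodes[i], exclude, out);
    }
  }
  return out;
}
```

Because the collected nodes are the live ones, the recorded offsets can be applied to them directly and no clone-to-source mapping is needed.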
Awesome, thanks. I actually stumbled upon the expando property approach just now, and it appears to have worked fine. I do like your latter suggestion, though, so I'll see if I can get that working. Thanks again!
Bungle