Filtering in E4X

+1 Q:

This is just a simple question. I'm currently using Mozilla's Rhino to develop a little webapp. As one step, I need to fetch a webpage and filter all of its nodes. To do that, I use E4X. I thought I could do it like this:

var pnodes = doc..*(p);

But that produces an error. What is the right way to do it?

(BTW: this is just a step toward better performance. The code already works, it's just a bit slow.)

+2 A:

You should be able to use the following:

doc..*.(name() == "p")

Note that there is a bug in the Rhino and SpiderMonkey implementations where the filter expression name() == "p" is not correctly scoped to the current node, so none of the XML or XMLList methods are defined inside the filter.

Another workable solution is to look up all p nodes in the document and accumulate the parent of each in an array.

var elements = [];

// Visit every p descendant and record its parent once.
for each (var p in doc..p) {
    var parent = p.parent();
    if (elements.indexOf(parent) === -1) {
        elements.push(parent);
    }
}

Anurag 2010-05-25 18:08:27

Yes, efficiency is my problem. My current solution checks every node for p nodes before processing it, which takes some time: around 0.5s for a normal page. I guess that just collecting the nodes efficiently might cut this time dramatically. I have seen people filter nodes the way I tried before; I just can't figure out how to do it.

FB55 2010-05-25 19:18:44

The above version natively filters all `p` nodes and collects the parent of each into an array, rather than checking whether each node is a `p`. You could use the parent node as a key in an object to make lookups `O(1)` instead of using `indexOf` to check whether the node already exists in the array.

Anurag 2010-05-25 19:51:46
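
The keyed-object idea from the comment above could look roughly like the sketch below. It is written in plain JavaScript so it runs without E4X, which means the nodes are stand-ins: ordinary objects with a made-up `id` property. The helper name `uniqueByKey` and the `keyOf` callback are assumptions for illustration, not part of Rhino or E4X.

```javascript
// Hypothetical helper: keep the first occurrence of each item, using an
// object as a set so each membership check is a single property lookup
// instead of an indexOf scan over the array.
function uniqueByKey(items, keyOf) {
    var seen = {};
    var result = [];
    for (var i = 0; i < items.length; i++) {
        var key = keyOf(items[i]);
        if (!Object.prototype.hasOwnProperty.call(seen, key)) {
            seen[key] = true;
            result.push(items[i]);
        }
    }
    return result;
}

// Stand-in data (assumption): each entry plays the role of p.parent()
// for one p node; several p nodes share the same parent.
var parents = [
    { id: "div1" }, { id: "div1" }, { id: "td7" }, { id: "div1" }
];

var elements = uniqueByKey(parents, function (node) { return node.id; });
```

With real E4X nodes the key has to be derived from the node somehow, for example from its serialized form via toXMLString(). That is only one possible choice, and it would treat two distinct parents with identical markup as the same element, so whether it fits depends on the page being processed.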
