ansaurus

Question

Scala: Given a scala.xml.Node, what's the most efficient way of getting the second (or n-th) child element?

Answer 1

+1 A:

What I have so far is:

node.child.filter(_.isInstanceOf[scala.xml.Elem])(1)

eed3si9n 2010-02-18 04:38:07

Answer 2

+1 A:

Get the second element named "foo", or None if not found:

(xml \ "foo").drop(1).headOption

Or, more efficiently in case of large XML structures:

xml.child.toStream.collect {
   case e: scala.xml.Elem if e.label == "foo" => e
}.drop(1).headOption

(This is with Scala 2.8. In the 2.8 pre-release builds, collect was called partialMap.)

UPDATE

To get the second, regardless of name:

(xml \ "_").drop(1).headOption
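
As a rough illustration of the wildcard form, here is a small sketch. It assumes the scala-xml library is on the classpath; the object name and the sample document are made up for the example. The "_" selector skips text nodes, so the text between the two elements does not count as a child here.

```scala
import scala.xml.Node

// Hypothetical wrapper object, only there to make the snippet compile on its own.
object WildcardSecondChild {
  // Sample document invented for illustration: two elements with text in between.
  val xml = <a><b/>some text<c/></a>

  // `\ "_"` selects the child elements regardless of name; text nodes are left out.
  // drop(1).headOption yields the second one, or None if there are fewer than two.
  val second: Option[Node] = (xml \ "_").drop(1).headOption
}
```

With the sample above, second should hold the <c/> element; on a node with a single child element it would be None.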

retronym 2010-02-18 06:54:39

Thanks for your answer. Just for clarification, as @huynhjl wrote, I am interested in the second child element, not the second instance of foo.

eed3si9n 2010-02-18 14:15:19

Answer 3

+3 A:

I like retronym's drop(n).headOption pattern, as it accounts for the case where you have fewer children than n. But I think you meant the second child element (excluding text nodes), not the second instance of the <foo> tag. With that in mind, here are two options, one using collect (called partialMap in the 2.8 pre-releases) and one building on your answer. In both, n is zero-based, so the second child element is n = 1:

node.child.collect { case x: scala.xml.Elem => x }.drop(n).headOption

node.child.filter(_.isInstanceOf[scala.xml.Elem]).drop(n).headOption

This assumes that you don't want to extract the text in:

val node = <something><foo/>text</something>
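
To make that assumption concrete, here is a minimal sketch (assuming scala-xml is on the classpath and a Scala version where partialMap is spelled collect; the object and value names are hypothetical):

```scala
import scala.xml.Elem

// Hypothetical wrapper object so the snippet compiles on its own.
object TextNodesSkipped {
  val node = <something><foo/>text</something>

  // All children, text node included: the <foo/> element and the text "text".
  val allChildren = node.child

  // Element children only: just <foo/>.
  val elemChildren = node.child.collect { case e: Elem => e }

  // There is no second child *element*, so this is None rather than the text node.
  val second = elemChildren.drop(1).headOption
}
```

So for this node, asking for the second child element gives None, even though node.child itself has two entries.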

Efficiency-wise, the only other thing I can think of is to make the filter lazy, so that retrieving the second child doesn't filter every child when there are a large number of them. I think this can be achieved by running filter on node.child.iterator instead.

Edit: Changed toIterable to iterator. Good point: calling drop(n) on an ArrayBuffer will cause additional allocations. How many is hard to tell, since it seems drop is overridden in IndexSeqLike. Using the iterator addresses that too. So for a large number of children:

node.child.iterator.filter(_.isInstanceOf[scala.xml.Elem]).drop(n).next

If you want this to be safe, you need to check hasNext first, since next throws when the iterator is exhausted. You may want to define a function for that.
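
One possible shape for such a function, as a sketch rather than a definitive version: NthChild and nthChildElem are names made up here, n is taken to be zero-based, and the code assumes scala-xml on the classpath and a Scala version where Iterator has collect.

```scala
import scala.xml.{Elem, Node}

// Hypothetical helper object; the names are not part of scala.xml.
object NthChild {
  // Returns the child element at zero-based index n, skipping text and other
  // non-element nodes, or None when there are fewer than n + 1 child elements.
  def nthChildElem(node: Node, n: Int): Option[Elem] = {
    // The iterator is lazy: children past the requested element are never inspected.
    val elems = node.child.iterator.collect { case e: Elem => e }.drop(n)
    if (elems.hasNext) Some(elems.next()) else None
  }
}
```

Usage would look like NthChild.nthChildElem(node, 1) for the second child element. Going through the iterator avoids building the intermediate filtered collection, and the hasNext check replaces the exception with None.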

All of this is tested only in 2.8.

huynhjl 2010-02-18 08:02:02

So drop(n).headOption buys me safety, but not efficiency? Since child returns ArrayBuffer, making it iterable avoids filtering cost only, right?

eed3si9n 2010-02-18 11:56:52
