I know that there are dozens of ways to select the first child element in Nokogiri, but which is the cheapest? I can't get around using Node#children, which sounds awfully expensive. Say that there are 10000 child nodes, and I don't want to touch the 9999 others...
You can try it yourself and benchmark the result.
I created a quick benchmark: http://gist.github.com/283825
$ ruby test.rb
Rehearsal ---------------------------------------------------
xpath/first()     3.290000   0.030000   3.320000 (  3.321197)
xpath.first       3.360000   0.010000   3.370000 (  3.381171)
at                4.540000   0.020000   4.560000 (  4.564249)
at_xpath          3.420000   0.010000   3.430000 (  3.430933)
children.second   0.220000   0.010000   0.230000 (  0.233090)
----------------------------------------- total: 14.910000sec

                      user     system      total        real
xpath/first()     3.280000   0.000000   3.280000 (  3.288647)
xpath.first       3.350000   0.020000   3.370000 (  3.374778)
at                4.530000   0.040000   4.570000 (  4.580512)
at_xpath          3.410000   0.010000   3.420000 (  3.421551)
children.second   0.220000   0.010000   0.230000 (  0.226846)
From my tests, children appears to be the fastest method.
An approach that neither uses XPath nor touches every child of the parent is to combine Node#child(), Node#next_sibling() and Node#element?(). (Node#next is an alias for Node#next_sibling.)
Something like this:
def first(node)
  element = node.child
  while element
    if element.element?
      return element
    else
      element = element.next
    end
  end
  nil
end
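The same walk can also be written as an enumerator, so the stopping condition lives at the call site. This is only a sketch: each_child_lazily is a hypothetical helper name, not part of Nokogiri, and it assumes nothing about the node beyond the #child and #next_sibling methods already mentioned above.

```ruby
# Hypothetical helper (not a Nokogiri method): yields the children of +node+
# one at a time, in document order, by following the sibling chain.
# It only calls #child and #next_sibling on the objects it is given.
def each_child_lazily(node)
  # Without a block, hand back an Enumerator so callers can chain
  # Enumerable methods such as #find, which stop as soon as they can.
  return enum_for(:each_child_lazily, node) unless block_given?

  current = node.child
  while current
    yield current
    current = current.next_sibling
  end
end
```

With a Nokogiri node this would be called as each_child_lazily(parent).find(&:element?), which stops walking at the first element child and never builds a NodeSet.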
Node#child is the fastest way to get the first child node. Note that this node may be a text node (whitespace, for example) rather than an element.
However, if the node you're looking for is NOT the first (e.g., the 99th), then there is no faster way to select that node than to call #children and index into it.
You are correct in stating that it's expensive to build a NodeSet for all children if you only want the first one.
One limiting factor is that libxml2 (the XML library underlying Nokogiri) stores a node's children as a linked list. So you'll need to traverse the list (O(n)) to select the desired child node.
It would be feasible to write a method that simply returns the nth child without instantiating a NodeSet, or even Ruby objects, for all the other children. My advice would be to open a feature request at http://github.com/tenderlove/nokogiri/issues or send an email to the nokogiri mailing list.
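In the meantime, a pure-Ruby approximation is possible. The sketch below is not the library-level method described above: nth_child is a hypothetical name, and because it steps through #next_sibling it still creates a Ruby object for each node it passes over. What it avoids is the NodeSet and any work on children after the requested index. The traversal is still O(n) in the index, as expected for a linked list.

```ruby
# Hypothetical helper (not a Nokogiri method): returns the child at the
# zero-based +index+, or nil if +node+ has fewer children than that.
# It only calls #child and #next_sibling on the objects it is given.
def nth_child(node, index)
  raise ArgumentError, "index must be non-negative" if index.negative?

  current = node.child
  index.times do
    # Ran off the end of the sibling chain before reaching +index+.
    return nil if current.nil?

    current = current.next_sibling
  end
  current
end
```

For example, nth_child(parent, 98) would return the 99th child node. Text nodes count as children here, just as they do when indexing into #children.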