I'm using pQuery (a Perl port of jQuery) to select elements and retrieve text from an HTML document.

Consider the following markup:

<x>
   <y>code1</y>
   <z>stuff</z>
   <y>code2</y>
   <z>foobar</z>
</x>

And the following pQuery code:

my $target_value = pQuery($markup)->find($pquery_selector)->text;

I'm trying to formulate $pquery_selector so that it matches <z>foobar</z> in the markup above using the following rule: find the z element that follows a y element whose body contains "code2". While this is possible using jQuery, I'm not sure that the pQuery syntax is powerful enough to handle such an expression.

Is this type of selection possible using the pQuery syntax?

+1  A: 

In jQuery it might be possible to write a selector like 'y:contains(code2)+z'. However, pQuery is still unfinished (as of version 0.07): a selector like x+z just raises an error, which suggests the module developer hasn't gotten around to porting that part of the jQuery code.

Since pQuery hasn't been touched since 2008, I'd recommend either fixing it yourself (the code is on CPAN and GitHub) or using a more mature module like HTML::TreeBuilder::XPath (which requires learning XPath syntax, but actually works for non-trivial things).

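The XPath equivalent of the above jQuery selector would be '//y[contains(text(), "code2")]/following-sibling::*[1][self::z]'. The trailing *[1][self::z] keeps only the immediately following sibling and requires it to be a z, which mirrors jQuery's +; a plain following-sibling::z would match every later z sibling instead. The inner quotes are double quotes so the expression can sit inside a single-quoted Perl string.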
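
To try the XPath approach against the markup from the question, here is a minimal sketch. It uses XML::LibXML rather than HTML::TreeBuilder::XPath, on the assumption that the input is well-formed XML like the sample (real-world HTML may need an HTML-aware parser). The helper name z_after_y is made up for illustration, and the expression restricts the match to the sibling immediately after the y, like jQuery's +.

```perl
use strict;
use warnings;
use XML::LibXML;

# Sample markup from the question; assumed to be well-formed XML.
my $markup = <<'XML';
<x>
   <y>code1</y>
   <z>stuff</z>
   <y>code2</y>
   <z>foobar</z>
</x>
XML

# Hypothetical helper: return the text of the z element that directly
# follows a y element whose text contains $code.
# Assumes $code contains no double quotes, since it is interpolated
# straight into the XPath string literal.
sub z_after_y {
    my ($xml, $code) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml);

    # *[1][self::z] = the first following sibling, and only if it is a z.
    my $xpath = qq{//y[contains(text(), "$code")]}
              . q{/following-sibling::*[1][self::z]};

    # findvalue returns the string value of the matched node(s),
    # or an empty string when nothing matches.
    return $doc->findvalue($xpath);
}

my $target_value = z_after_y($markup, 'code2');
print "$target_value\n";
```

HTML::TreeBuilder::XPath documents a findvalue method as well, so the same expression should carry over, though that module may need extra configuration before it keeps non-HTML tags such as x, y and z in the tree.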