I have a document containing `<a href>` links that I want to extract. The links I want can be identified by part of the URL they point to. There are other, similar links that I want to discard.

The urls of the links I want are of the format

http://www.xxxxxxxxxxxxxxxxxxx.com/index.php?showtopic=44&hl=

I want to search for links whose URL contains `hl=`. Is this possible?

+1  A: 

You can just do a normal find on the document's set of A-tags.

document.search('a').find { |link| link['href'].to_s.include?('hl=') }
Chuck
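
Note that `find` returns only the first match. If every matching link is needed, `select` returns them all. The following is a minimal sketch of that variation, not part of the answer above: the hashes are hypothetical stand-ins for the elements returned by `document.search('a')` (they only mimic `link['href']`), the `example.com` URLs are placeholders, and `hl_link?` is a helper name invented here. It checks for a query parameter named exactly `hl` instead of a substring, which may help discard similar-looking links.

```ruby
require 'uri'

# Hypothetical stand-ins for the elements returned by document.search('a').
# Like those elements, a Hash answers link['href'] with nil when the
# attribute is missing.
links = [
  { 'href' => 'http://www.example.com/index.php?showtopic=44&hl=' },
  { 'href' => 'http://www.example.com/index.php?showtopic=44' },
  { 'href' => 'http://www.example.com/index.php?showtopic=45&hl=ruby' },
  { 'name' => 'top' } # an anchor with no href at all
]

# True when the URL's query string has a parameter named exactly "hl".
# (hl_link? is an illustrative helper, not a library method.)
def hl_link?(href)
  return false if href.nil?
  query = URI.parse(href).query
  return false if query.nil?
  URI.decode_www_form(query).any? { |key, _value| key == 'hl' }
rescue URI::InvalidURIError, ArgumentError
  false # treat unparsable URLs as non-matching
end

# select keeps every matching link; find would stop at the first one.
wanted = links.select { |link| hl_link?(link['href']) }
wanted.each { |link| puts link['href'] }
```

With a real parsed document, the same block would presumably be applied to `document.search('a')` in place of the `links` array.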