I'm trying to use Hpricot to get the value inside a span whose exact class name I don't know. I only know that it follows the pattern "foo_[several digits]_bar".
Right now, I'm getting the entire containing element as a string and using a regex to parse the string for the tag. That solution works, but it seems really ugly.
doc = Hpricot(open("http://scrape.example.com/search?q=#{ticker_symbol}"))
elements = doc.search("//span[@class='pr']").inner_html
string = ""
elements.each do |attr|
  if(attr =~ /foo_\d+_bar/)
    string = attr
  end
end
# get rid of the span tags, just get the value
string.sub!(/<\/span>/, "")
string.sub!(/<span.+>/, "")
return string
It seems like there should be a better way to do that. I'd like to do something like:
elements = doc.search("//span[@class='" + /foo_\d+_bar/ + "']").inner_html
But that doesn't run, since a Regexp can't be concatenated onto a String. Is there a way to search with a regular expression?
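For reference, here is a minimal sketch of the kind of thing I'm after: select the spans first, then filter on the class attribute in plain Ruby instead of inside the selector string. It assumes the parsed elements respond to `[]` for attribute lookup and to `inner_html`. `value_for_class` and `FakeSpan` are hypothetical names made up for this illustration (not part of Hpricot), and `FakeSpan` only stands in for a parsed element so the snippet runs on its own.

```ruby
# Hypothetical helper (not part of Hpricot): returns the inner HTML of the
# first element whose class attribute matches the given pattern, or nil.
# Assumes each element responds to [] (attribute lookup) and inner_html.
def value_for_class(elements, pattern = /foo_\d+_bar/)
  match = elements.find { |el| el['class'].to_s =~ pattern }
  match && match.inner_html
end

# Stand-in for a parsed span element, used only so this sketch is runnable
# without a real document.
class FakeSpan
  attr_reader :inner_html

  def initialize(css_class, inner_html)
    @css_class = css_class
    @inner_html = inner_html
  end

  # Mimics attribute lookup by name; only 'class' is modelled here.
  def [](name)
    name == 'class' ? @css_class : nil
  end
end

spans = [
  FakeSpan.new('pr', 'ignored'),
  FakeSpan.new('foo_12345_bar', '42.17'),
]

puts value_for_class(spans)  # prints 42.17

# With a real document the call would presumably be something like:
#   value_for_class(doc.search("//span"))
```

Because the filtering happens in Ruby, the regex stays a real Regexp and there are no span tags to strip afterwards.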