ansaurus

Question

Answer 1

A:

One could modify the incoming HTML before parsing it.

require 'open-uri'
require 'hpricot'
# read works whether open-uri hands back a StringIO or a Tempfile
html = open("http://scrape.example.com/search?q=#{ticker_symbol}").read
html.gsub!(/class="(foo_\d+_bar)"/) { "class=\"foo_bar #{$1}\"" }
doc = Hpricot(html)

After that, you can identify the elements by the foo_bar class. This is neither elegant nor general, but it could prove to be more efficient.
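To make the substitution concrete, here is a minimal sketch run on a made-up fragment. The sample markup and the variable names are assumptions for illustration; the real page's structure is not shown in this thread.

```ruby
# Hypothetical sample markup, not taken from the scraped page.
html = '<span class="foo_12_bar">A</span>' \
       '<span class="other">B</span>' \
       '<span class="foo_7_bar">C</span>'

# Prepend a fixed class name to every class attribute of the form foo_<digits>_bar.
# The original class is kept, so nothing that relied on it stops matching.
tagged = html.gsub(/class="(foo_\d+_bar)"/) { "class=\"foo_bar #{$1}\"" }

puts tagged
```

Only the two spans whose class has the foo_<digits>_bar shape gain the extra foo_bar class; the span with class "other" is left alone.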

anshul 2009-12-29 09:05:16

Thanks for the suggestion. That would have worked, except it returns a string. I'd rather get an Hpricot Element object back.

AaronM 2010-01-03 23:33:27

Answer 2

+2 A:

This should do it:

doc.search("span[@class^='foo'][@class$='bar']")
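Note that the selector tests only the start and end of the class attribute, so it is looser than a pattern such as foo_<digits>_bar. The sketch below uses plain strings (hypothetical class values, no Hpricot involved) to show the difference. If stricter matching is needed, the same regex check could presumably be applied to each returned element's class attribute, for example via elem['class'].

```ruby
# Hypothetical class values; only the foo_<digits>_bar shape comes from this thread.
class_names = ["foo_12_bar", "foo_bar", "foobar", "foo_x_bar", "other"]

# What a starts-with "foo" plus ends-with "bar" test accepts.
loose = class_names.select { |c| c.start_with?("foo") && c.end_with?("bar") }

# A stricter check: the whole value must be foo_<digits>_bar.
strict = class_names.select { |c| c =~ /\Afoo_\d+_bar\z/ }

p loose
p strict
```

Here the loose test accepts everything except "other", while the strict regex keeps only "foo_12_bar".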

Nakul 2009-12-30 11:34:54

This looks like what I want. I'll give it a try and see how that goes.

AaronM 2010-01-02 22:11:01

Worked perfectly! That's exactly what I wanted.

AaronM 2010-01-03 23:31:56

Searching Hpricot with Regex
