Could somebody explain the code below to me? I only need the lines marked with #???. I asked a question earlier and got this code as an answer, and it does 100% of what I asked for. Now I need to add some code and I do not know where to put it. I tried, and it did not work. I have only used a custom class in Ruby twice so far; otherwise I have always used arrays.
Is the code below actually using a custom class? :-)
I need to add code that follows the URL of each thread and extracts some details from the post itself. I know how to do that part, but where should the code go? My attempts gave either errors or wrong results.
I am scraping a web forum and need to go through several pages, so I will call the same code on every page. Is the code structure below OK for that?
#!/usr/bin/ruby1.8
require 'nokogiri'
require 'pp'
html = <<-EOS
(The HTML from the question goes here)
EOS
doc = Nokogiri::HTML(html)
rows = doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr')
details = rows.collect do |row|
  detail = {} #??? is this the definition of a custom class?
  [ #???
    [:title, 'td[3]/div[1]/a/text()'], #???
    [:name, 'td[3]/div[2]/span/a/text()'],
    [:date, 'td[4]/text()'],
    [:time, 'td[4]/span/text()'],
    [:number, 'td[5]/a/text()'],
    [:views, 'td[6]/text()'],
  ].collect do |name, xpath| #??? is this filling it with data?
    #??? where do name and xpath come from?
    detail[name] = row.at_xpath(xpath).to_s.strip #??? is this where all the date, time, etc. attributes get filled in?
  end
  detail #??? this is really tricky for me. Why is it here?
end
pp details
# => [{:time=>"23:35",
# => :title=>"Vb4 Gold Released",
# => :number=>"24",
# => :date=>"06 Jan 2010",
# => :views=>"1,320",
# => :name=>"Paul M"}]
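
For reference, here is a minimal sketch of how the same structure might be wrapped in a method that is called once per page, with a comment marking one place where a per-thread step could go. It is only an illustration under stated assumptions: `FakeRow`, `FIELDS`, and `extract_details` are hypothetical names and not part of Nokogiri or of the code above. `FakeRow` just stands in for a Nokogiri row so the sketch runs without any gems, and only two of the six fields are shown.

```ruby
# Hypothetical stand-in for a Nokogiri row (an assumption for this sketch).
# It answers at_xpath by looking the XPath string up in a plain Hash.
class FakeRow
  def initialize(cells)
    @cells = cells
  end

  def at_xpath(xpath)
    @cells[xpath]
  end
end

# Each inner array is a [key, xpath] pair, as in the code above.
FIELDS = [
  [:title, 'td[3]/div[1]/a/text()'],
  [:views, 'td[6]/text()'],
]

# Hypothetical helper: call it once for the rows of each forum page.
def extract_details(rows)
  rows.collect do |row|
    detail = {}                    # an ordinary empty Hash, not a custom class
    FIELDS.each do |name, xpath|   # each pair is unpacked into name and xpath
      detail[name] = row.at_xpath(xpath).to_s.strip
    end
    # A per-thread step (follow the thread link, store more keys in detail)
    # could go here, before the hash is returned.
    detail                         # last expression: the block's value for this row
  end
end

# One fake page holding a single row.
page = [
  FakeRow.new({
    'td[3]/div[1]/a/text()' => ' Vb4 Gold Released ',
    'td[6]/text()'          => '1,320',
  }),
]
details = extract_details(page)
```

With this shape, `details` ends up as an array holding one hash per row, and the results from several pages could be joined with `+` or `concat`. The sketch uses `each` for the inner loop because the return value of that loop is not used; only the outer `collect` needs its block to end in `detail`.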