views:

99

answers:

1

Could somebody explain the code below for me? Only the lines marked with #???. I asked a question earlier and got an answer that does 100% of what I asked for. Now I need to add some code and I do not know where. I tried and it did not work. I have used a custom class in Ruby only twice so far; otherwise I have used arrays.

Is a custom class actually used below? :-)

I need to add code that follows the URL of each thread and extracts some details from the post itself. I know how to do that, but where should I add the code? My attempts produced either errors or wrong results.

I am scraping a web forum and need to go through several pages, so I will call the same code on every page. Is the code structure below OK for that?

#!/usr/bin/ruby1.8
require 'nokogiri'
require 'pp'

html = <<-EOS
  (The HTML from the question goes here)
EOS

doc = Nokogiri::HTML(html)
rows = doc.xpath('//table/tbody[@id="threadbits_forum_251"]/tr')
details = rows.collect do |row|
  detail = {}                               #??? - is that definition of custom class
  [                                         #???
    [:title, 'td[3]/div[1]/a/text()'],      #???
    [:name, 'td[3]/div[2]/span/a/text()'],
    [:date, 'td[4]/text()'],
    [:time, 'td[4]/span/text()'],
    [:number, 'td[5]/a/text()'],
    [:views, 'td[6]/text()'],
  ].collect do |name, xpath|                #??? - filling with data?
                                          #??? - where did we get name and xpath from?
    detail[name] = row.at_xpath(xpath).to_s.strip #??? here we fill all date,time etc attributes??
  end
  detail                          #??? this is really tricky for me. Why is it here?
end
pp details

# => [{:time=>"23:35",
# =>   :title=>"Vb4 Gold Released",
# =>   :number=>"24",
# =>   :date=>"06 Jan 2010",
# =>   :views=>"1,320",
# =>   :name=>"Paul M"}]
+6  A: 

No offense is intended by this, but you don't appear to have any knowledge of Ruby. Your comments indicate you have no knowledge of methods, Hashes, Arrays, blocks, or iterators. Please visit the Ruby website, go through some of their tutorials, and read the documentation.

I'd also like to suggest purchasing Programming Ruby: The Pragmatic Programmer's Guide for a great tutorial and reference.

detail = {}          #??? - is that definition of custom class

See Hash
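
To make that concrete, here is a minimal sketch in plain Ruby. It assumes nothing beyond the standard language, and the sample values are simply copied from the output shown in the question. `{}` is an empty Hash literal, not a class definition, and keys such as `:title` are Symbols:

```ruby
# {} creates an empty Hash object; it does not define a class.
detail = {}

# Store values under Symbol keys, the same way detail[name] = ... does above.
detail[:title] = "Vb4 Gold Released"
detail[:views] = "1,320"

puts detail.class     # prints: Hash
puts detail[:title]   # prints: Vb4 Gold Released
```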

[                                         #???
  [:title, 'td[3]/div[1]/a/text()'],      #???

See Array
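
As a rough illustration, the bracketed lines build an Array whose elements are themselves two-element Arrays: a Symbol (the future hash key) paired with an XPath String. The variable name `fields` below is made up for this sketch; the original code uses the Array literal directly without naming it:

```ruby
# An Array of [key, xpath] pairs. "fields" is a hypothetical name.
fields = [
  [:title, 'td[3]/div[1]/a/text()'],
  [:name,  'td[3]/div[2]/span/a/text()'],
]

puts fields.length          # prints: 2
puts fields[0][0].inspect   # prints: :title
puts fields[0][1]           # prints: td[3]/div[1]/a/text()
```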

].collect do |name, xpath| #??? - filling with data?
                           #??? - where did we get name and xpath from?

See Enumerable
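
A small sketch of where `name` and `xpath` come from: they are block parameters. When each element handed to the block is a two-element Array, Ruby splits it across the two parameters. The names `pairs` and `seen` are invented for this example:

```ruby
# "pairs" and "seen" are hypothetical names used only for this sketch.
pairs = [
  [:date, 'td[4]/text()'],
  [:time, 'td[4]/span/text()'],
]

seen = []
pairs.each do |name, xpath|
  # On each pass, name is the Symbol and xpath is the String of one pair.
  seen << "#{name} -> #{xpath}"
end

puts seen
# prints:
# date -> td[4]/text()
# time -> td[4]/span/text()
```

This sketch uses `each` rather than `collect`. In the code from the question the return value of the inner `collect` is never used, so the two behave the same there.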

detail   #??? this is really tricky for me. Why is it here?

See blocks and Enumerable#collect
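
The sketch below tries to show why the bare `detail` line matters and where extra per-row code could go. It is not the original scraper: the rows are plain Arrays with made-up sample data standing in for the Nokogiri nodes, integer indexes stand in for the XPath strings, and `:title_length` is a hypothetical extra field. `collect` builds a new Array from whatever each run of the block evaluates to last, so ending the block with `detail` makes each element the filled-in Hash:

```ruby
# Stand-in data: plain Arrays instead of Nokogiri rows (an assumption for this sketch).
rows = [
  ["Vb4 Gold Released", "Paul M"],
  ["Another thread", "Someone"],   # made-up sample row
]

details = rows.collect do |row|
  detail = {}
  [[:title, 0], [:name, 1]].each do |name, index|
    detail[name] = row[index]
  end

  # Extra per-row work could be added here, before the block returns,
  # with its result stored in the same hash. This field is hypothetical.
  detail[:title_length] = detail[:title].length

  detail   # last expression in the block: becomes this row's element in details
end

puts details.length             # prints: 2
puts details[0][:name]          # prints: Paul M
puts details[0][:title_length]  # prints: 17
```

Without the final `detail` line, each element of `details` would instead be whatever the preceding statement returned.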

Answering your question in any further depth is beyond the scope of StackOverflow.

hobodave
Should be a comment on the question.
Yar
There, updated the question to actually provide "answers" as much as it pains me.
hobodave
People answer questions that require little programs to be written in other languages (e.g., PHP), even when it's clear that the questioner doesn't speak the language. For some reason in Ruby you have to go read the documentation... why is that?
Yar
@hobodave thank you for your comments and the links provided.
Radek
It has nothing to do with Ruby. SO isn't meant to give tutorials and how-tos on _several major_ __basics__ of a language. That's what documentation is for. Teach a man to fish ...
hobodave
You're right, hobodave. Maybe we can have a close-question tag for that, "asking for fish, not how to fish."
Yar
@yar: lol, I'm not sure if you are being facetious, but I like it :)
hobodave
@hobodave @yar I like and appreciate the answers from both of you. I do know that I need to study this subject. I usually learn things on the fly, at the moment they come up. I read 'Humble Little Ruby Book' 2 years ago and I need to read something again. I like yar's answer better, and I still need to teach myself how to "do fishing".
Radek
@hobodave thank you for editing your answer so many times and making it so perfect
Radek
@yar: why did you delete your answer? I did like it and wanted to read it again ...
Radek
@Radek, I will put the answer back, but if/when a recalculation happens on SO, it pays not to have loser answers. But I'll put it back for you, since our goal here is learning and helping people learn.
Yar
@yar: you can email it to me :-) hobodave improved his answer so that it covers almost 100% of what you wrote, but your answer still has something valuable for me.
Radek