There is a website that lists the PIN codes for a given state. For example, the Indian postal website shows the details when I select a state from the drop-down.

I need to write a Ruby script that creates a CSV file with all the data for a particular state.

Today is my first day with Ruby, and I'm not sure how to approach this. Any help in the right direction would be appreciated.

Thanks

+1  A: 

You may be interested in the FasterCSV gem:

http://fastercsv.rubyforge.org/

gem install fastercsv

Then, something like this:

require 'fastercsv'
FasterCSV.open("temp.csv", "w") do |csv|
  csv << ["line1row1", "line1row2"]
  csv << ["line2row1", "line2row2"]
  # ...
end
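For readers on Ruby 1.9 or later: FasterCSV became the standard-library CSV class, so the same thing should work without installing a gem. A minimal sketch, assuming a modern Ruby and the same temp.csv file name as above:

```ruby
require 'csv'

# Write two rows to temp.csv using the standard-library CSV class,
# which exposes the same block-style API as FasterCSV.
CSV.open('temp.csv', 'w') do |csv|
  csv << ['line1row1', 'line1row2']
  csv << ['line2row1', 'line2row2']
end
```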
Gaetan Dubar
+1  A: 

You need to clarify / give more information.

  • Are you trying to screen-scrape that web site or are you trying to produce something like it?
  • If the former, you'll need Net::HTTP and probably some regular expressions
  • If the latter, where is the data coming from (e.g. in what form do you get it)?

In any case, Ruby is a good language to putter around with. Try irb for interactive testing of snippets. Generating CSV can be very easy, especially if you don't have any complex string fields (e.g., things that might have embedded quotes or commas).
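To illustrate why embedded quotes matter, here is a small sketch comparing a naive join with the standard-library CSV writer. The field values are made up for the example:

```ruby
require 'csv'

# Hypothetical fields: one contains quotes, one contains a comma.
fields = ['Main Road', 'the "old" office', 'Block A, Sector 2']

# Naive approach: the embedded comma silently creates an extra column.
naive = fields.join(',')

# CSV.generate_line quotes fields where needed and doubles embedded quotes.
safe = CSV.generate_line(fields)

puts naive
puts safe
```

If the data is known to be plain (no commas, quotes, or newlines), the naive join is fine; otherwise letting the CSV library do the quoting is the safer choice.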

To screen scrape:

  • grab the page with Net::HTTP
  • grep through the body using regular expressions to pick out the values you want
  • make it into CSV either with string interpolation or using the package mentioned in the other answer
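The three steps above can be sketched roughly as follows. This is only an illustration: the URL, the table layout matched by ROW_PATTERN, and the helper names (fetch_page, extract_rows, rows_to_csv) are all assumptions, not the real site's markup or any library's API. If the drop-down submits a form, Net::HTTP.post_form may be needed instead of a plain GET.

```ruby
require 'net/http'
require 'uri'
require 'csv'

# Step 1: download the raw HTML of a page (performs a network request).
def fetch_page(url)
  Net::HTTP.get(URI.parse(url))
end

# Assumed row layout: <tr><td>office name</td><td>6-digit code</td></tr>.
# The real page will almost certainly need a different pattern.
ROW_PATTERN = %r{<tr>\s*<td>([^<]+)</td>\s*<td>(\d{6})</td>\s*</tr>}m

# Step 2: pick out [office, pincode] pairs with a regular expression.
def extract_rows(html)
  html.scan(ROW_PATTERN).map { |office, pin| [office.strip, pin] }
end

# Step 3: turn the rows into CSV text with a header line.
def rows_to_csv(rows)
  CSV.generate do |csv|
    csv << ['office', 'pincode']
    rows.each { |row| csv << row }
  end
end

# Usage (hypothetical URL, not the real postal site):
#   html = fetch_page('http://example.com/pincodes?state=SomeState')
#   File.write('pincodes.csv', rows_to_csv(extract_rows(html)))
```

Keeping the parsing separate from the download makes it easy to try the regular expression in irb against a saved copy of the page before hitting the site.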
MarkusQ
Thanks for the reply, and I apologize for the lack of clarity in my initial question (I am still very new to Ruby). You are correct, I am trying to screen-scrape that website. Could you point me to some good examples of Net::HTTP? Thanks for your help.
MOZILLA
+2  A: 

You should be able to accomplish this using the following ruby gems:

You'll find documentation and examples for each gem at the URLs mentioned above and on Google. Besides that, a book on Ruby might help improve your Ruby skills.

Javier
A: 

It just so happens that I recently finished a Ruby program called Bankjob that does just this kind of thing, only for an online bank website.

It's completely open source and documented, so go check it out at bankjob.rubyforge.org.

Bankjob uses Mechanize and Hpricot (as suggested in other answers) to scrape a website with a table in it and produce CSV output. It also produces OFX, which is irrelevant to your need since that is specific to bank-statement data, but the CSV output should work for any kind of data.

You should at least be able to start with Bankjob and cut out what you don't need to get your postal info. In fact, you may be able to use it as is: create a specific scraper (which is documented) to get your data, then dump it to CSV with the --csv option.

Good luck.

Rhubarb