Hi all,

I have an array like "author", "post title", "date", "time", "post category", etc.

I scrape the details from a forum and I want to:

  • save the data using Ruby
  • update the data using Ruby
  • update the data using a text editor, or possibly one of the OpenOffice programs (Calc would be best)

I guess some kind of SQL database would be one solution, but I need something quick (something that I can do by myself :-)

Any suggestions?

Thank you

+1  A: 

You could serialize it to JSON and save it to a file. This would allow you to edit it using a simple text editor.

If you want to edit it in something like Calc, you could generate a CSV (comma-separated values) file and import it.
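
A minimal sketch of the JSON route, assuming Ruby's bundled `json` library. The sample rows, the variable names, and the temporary file name are made up for illustration:

```ruby
require "json"
require "tmpdir"

# Hypothetical sample rows: author, post title, date, time, post category.
posts = [
  ["alice", "Hello, world", "2010-01-15", "10:30", "general"],
  ["bob", "Ruby tips", "2010-01-16", "14:05", "programming"]
]

# Write to a temporary path so the sketch does not clobber a real file.
json_path = File.join(Dir.tmpdir, "posts_example.json")

# pretty_generate spreads the values over several lines,
# which makes the file easier to edit by hand.
File.write(json_path, JSON.pretty_generate(posts))

# Read the (possibly hand-edited) file back into nested arrays.
loaded_posts = JSON.parse(File.read(json_path))
```

A comma inside a title is not a problem here, because JSON strings are always quoted.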

ryeguy
A: 

If I understand correctly, you have a two-dimensional array. You could output it in CSV format like so:

array.each do |row|
    puts row.join(",")
end

Then you can import it into Calc to edit it, or just use a text editor.

If your data might contain commas, you should have a look at the csv library instead: http://ruby-doc.org/stdlib/libdoc/csv/rdoc/index.html
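
As a rough illustration of why the library helps, here is a sketch assuming the `csv` library as it ships with Ruby 1.9 and later. The sample row is invented:

```ruby
require "csv"

# Hypothetical row whose second field contains a comma.
row = ["Radek", "this is post with,comma", "2010-01-15"]

# generate_line wraps any field that contains the separator in double quotes.
line = CSV.generate_line(row)

# parse_line undoes the quoting, so the comma stays inside a single field.
parsed = CSV.parse_line(line)
```

A plain `join(",")` followed by `split(",")` would instead break that row into four fields.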

Kim
@Kim, that is exactly what I was thinking about: join when writing and split when reading. But I did not know how to handle a comma inside my data. I thought it wouldn't work in that case, so I posted my question here. I hope that the 'csv' library can handle that.
Radek
+6  A: 

YAML is your friend here.

require "yaml"
yaml = ["author","post title","date","time","post category"].to_yaml
File.open("filename", "w") do |f|
  f.write(yaml)
end

This will give you:

---
- author
- post title
- date
- time
- post category

Going the other way, you get:

require "yaml"
YAML.load(File.read("filename")) # => ["author","post title","date","time","post category"]

YAML is easily human-readable, so you can edit it with any text editor (not a word processor like OpenOffice). You are not limited to serializing arrays and strings: YAML works out of the box for most Ruby objects, even objects of user-defined classes. This is a good introduction to the YAML syntax: http://yaml.kwiki.org/?YamlInFiveMinutes.
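
For several scraped posts rather than a single list of names, the same idea might look like the sketch below. The hash keys, sample values, and file name are assumptions for illustration, and `YAML.safe_load` is used for reading because a hand-edited file only needs plain types:

```ruby
require "yaml"
require "tmpdir"

# Hypothetical records: one hash per scraped post.
posts_for_yaml = [
  { "author" => "alice", "title" => "Hello, world", "category" => "general" },
  { "author" => "bob", "title" => "Ruby tips", "category" => "programming" }
]

# Write to a temporary path so the sketch does not clobber a real file.
yaml_path = File.join(Dir.tmpdir, "posts_example.yml")
File.write(yaml_path, posts_for_yaml.to_yaml)

# safe_load only builds plain types (strings, numbers, arrays, hashes).
posts_from_yaml = YAML.safe_load(File.read(yaml_path))
```

Each post becomes a small block of `key: value` lines, which is convenient to edit by hand.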

johannes
@johannes thank you for introducing YAML to me. I would like an XML format better. Still, I think CSV is the better solution for my case.
Radek
+2  A: 

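If you want to use a spreadsheet, CSV is the way to go. You can use the stdlib csv API like this:

# Note: CSV::Writer is the Ruby 1.8 API. It was removed in Ruby 1.9,
# where CSV.open('data.csv', 'w') is the replacement.
require 'csv'

my_2d_array = [[1,2],["foo","bar"]]

File.open('data.csv', 'w') do |outfile|
  CSV::Writer.generate(outfile) do |csv|
    # Each inner array becomes one line of the file.
    my_2d_array.each do |row|
      csv << row
    end
  end
end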

You can then open the resulting file in Calc or in most statistics applications.

The same API can be used to re-import the result into Ruby if you need to.
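
On Ruby 1.9 and later, where `CSV::Writer` no longer exists, a write and re-import could be sketched as follows. The sample table and the temporary path are made up for illustration:

```ruby
require "csv"
require "tmpdir"

# Hypothetical table: a header row plus one data row containing a comma.
table = [["author", "post title"], ["foo", "bar, with a comma"]]

# Write to a temporary path so the sketch does not clobber a real file.
csv_path = File.join(Dir.tmpdir, "posts_example.csv")

# CSV.open yields a writer; each appended array becomes one line.
CSV.open(csv_path, "w") do |csv|
  table.each { |row| csv << row }
end

# CSV.read returns the whole file as an array of arrays of strings.
table_from_disk = CSV.read(csv_path)
```

Note that every field comes back as a string, so numbers such as `1` would be read as `"1"` unless you convert them yourself.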

paradigmatic
@paradigmatic hi and thank you. I was thinking about CSV, but can it handle a comma in a field? I think my data can contain commas.
Radek
You can provide the separator you want as a second parameter to the `CSV::Writer.generate` method. You can then specify the same separator when you open the csv with calc.
paradigmatic
@paradigmatic yes, I was thinking about that too. But what character would I use? What if it is used in the post title? Can the 'csv' library handle "this is post with,comma" as one field?
Radek
There is no universal solution, I think. First analyze your data and check whether tabs ("\t") or rare characters (like '§') are absent from it, which would make them good candidates for separators. Another solution is to use the standard comma, but to wrap entries in double quotes. This is legal CSV: `"foo,bar","baz"`, with two items.
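
Both ideas can be tried with the newer `csv` library (Ruby 1.9 and later), which takes the separator as a `col_sep` option. This is only a sketch with invented sample fields:

```ruby
require "csv"

# Hypothetical fields: one with a comma, one with embedded double quotes.
fields = ["this is post with,comma", "said \"hi\"", "plain"]

# With a tab separator the comma needs no special treatment;
# the field with double quotes is still quoted and escaped by the library.
tab_line = CSV.generate_line(fields, col_sep: "\t")

# Parsing with the same separator recovers the original fields.
tab_fields = CSV.parse_line(tab_line, col_sep: "\t")
```

The same separator has to be chosen in Calc's import dialog when opening the file.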
paradigmatic
@paradigmatic: I will use FasterCSV. Thank you for the hint.
Radek