Hi, I'm trying to parse a CSV file and automatically create a table for it using SQL commands. The first line in the CSV gives the column headers. But I need to infer the column type for each one.

Is there any function in Ruby that would find the type of the content in each field? For example, the CSV line:

"12012", "Test", "1233.22", "12:21:22", "10/10/2009"

should produce types like

['integer', 'string', 'float', 'time', 'date']

Thanks!

+1  A: 

This might get you started

I don't have a complete solution, but this may help get you started. You can go from an example record to an array of Class objects to a string representation automatically, at least for some types, and then translate the strings...

$ irb
>> t = { "String" => "string", "Fixnum" => "integer", "Float" => "float" }
=> {"Float"=>"float", "Fixnum"=>"integer", "String"=>"string"}
>> ["xyz", 123, 123.455].map { |x| t[x.class.to_s] }
=> ["string", "integer", "float"]

You could map the classes directly, actually:

$ irb
>> t = { String => "string", Fixnum => "integer", Float => "float" }
=> {String=>"string", Float=>"float", Fixnum=>"integer"}
>> ["xyz", 123, 123.455].map { |x| t[x.class] }
=> ["string", "integer", "float"]
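
For anyone trying this on a newer Ruby: Fixnum was folded into Integer in Ruby 2.4 and is no longer defined in recent versions, so the hash key needs to be Integer there. A minimal sketch of the same idea under that assumption (the names TYPE_NAMES and type_names are made up for this example):

```ruby
# Map Ruby classes to the type names used in the question.
# Integer stands in for Fixnum, which newer Ruby versions no longer define.
TYPE_NAMES = { String => "string", Integer => "integer", Float => "float" }.freeze

# Hypothetical helper: look up the type name for each value's class.
# Classes missing from TYPE_NAMES map to nil.
def type_names(values)
  values.map { |v| TYPE_NAMES[v.class] }
end

p type_names(["xyz", 123, 123.455])
# => ["string", "integer", "float"]
```

This still only works on values that are already Ruby numbers, not on quoted strings such as "123".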
DigitalRoss
Hi DigitalRoss. Thanks for the suggestion. But I would need to know whether the string "123" is a number or not. In the example you provided, it was the type name of the number 123 that got mapped. That is, would it be possible to get ["xyz", "123", "123.455"] mapped into ["string", "integer", "float"]?
Jasim
In your question you didn't have quotes around the integers and the floats, only the strings and dates. If in fact everything is quoted you really should edit your question's example to reflect that. :-) I was designing to the wrong test case! :-) Perhaps you could show a piece of the real input?
DigitalRoss
Oops! It was a bad oversight, I'm sorry. The question has been corrected. And thanks for your time.
Jasim
+1  A: 
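require 'time'

# Convert a string to an Integer, a Float, or a Time where possible;
# otherwise return the string unchanged.
def to_something(str)
  if (something = Integer(str) rescue Float(str) rescue nil)
    something
  else
    # On Ruby 1.9 and later, Time.parse raises ArgumentError for input
    # it cannot parse, so fall back to the original string.
    Time.parse(str) rescue str
  end
end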

%w{12012 1233.22 12:21:22 10/10/2009 Test}.each do |str|
  something = to_something(str)
  p [str, something, something.class]
end

Results in

["12012", 12012, Fixnum]
["1233.22", 1233.22, Float]
["12:21:22", Sat Sep 12 12:21:22 -0400 2009, Time]
["10/10/2009", Sat Oct 10 00:00:00 -0400 2009, Time]
["Test", "Test", String]
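
Note that both "12:21:22" and "10/10/2009" come back as Time, while the question wanted separate 'time' and 'date' names. One possible way to get the type names directly is to test each string against stricter patterns. This is only a sketch: infer_type and valid_date? are hypothetical helpers, and it assumes times look like HH:MM:SS and dates like MM/DD/YYYY, which the question does not actually confirm.

```ruby
require 'date'

# Hypothetical helper: true if s is a real calendar date in MM/DD/YYYY form.
# The MM/DD/YYYY order is an assumption; "10/10/2009" alone is ambiguous.
def valid_date?(s)
  return false unless s.match?(%r{\A\d{1,2}/\d{1,2}/\d{4}\z})
  Date.strptime(s, '%m/%d/%Y')
  true
rescue ArgumentError
  # Date.strptime rejects impossible dates such as 13/45/2009.
  false
end

# Hypothetical helper: return a type name for a single CSV field.
def infer_type(str)
  s = str.strip
  return 'integer' if s.match?(/\A[+-]?\d+\z/)
  return 'float'   if s.match?(/\A[+-]?\d+\.\d+\z/)
  return 'time'    if s.match?(/\A([01]?\d|2[0-3]):[0-5]\d:[0-5]\d\z/)
  return 'date'    if valid_date?(s)
  'string'
end

fields = ["12012", "Test", "1233.22", "12:21:22", "10/10/2009"]
p fields.map { |f| infer_type(f) }
# => ["integer", "string", "float", "time", "date"]
```

Anything that matches none of the patterns falls through to 'string', which is probably the safest default for a column type.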
glenn jackman
That was exactly the code I needed. Great! Thank you glenn.
Jasim
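
Since the goal in the question is one type per column, the per-field guesses from every row eventually have to be combined. A rough sketch of one way to do that, assuming each row has already been turned into an array of type names like the one in the question. The helpers merge_types and column_types and the widening rules (integer plus float becomes float, any other mismatch becomes string) are assumptions for this example, not something from the answers above.

```ruby
# Hypothetical helper: combine two per-field type names into one column type.
def merge_types(a, b)
  return a if a == b
  # A column with both integers and floats can be stored as float.
  return 'float' if [a, b].sort == %w[float integer]
  # Any other disagreement falls back to the most general type.
  'string'
end

# rows_of_types holds one array of type names per CSV row.
# All rows are assumed to have the same number of fields (transpose needs that).
def column_types(rows_of_types)
  rows_of_types.transpose.map { |col| col.reduce { |a, b| merge_types(a, b) } }
end

rows = [
  %w[integer string float time date],
  %w[float string float time string]
]
p column_types(rows)
# => ["float", "string", "float", "time", "string"]
```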