I am trying to load a member database as part of my data migration. Quite a few of the names contain special characters, such as "Ciarán". I've set up a simple example like this:

require 'rubygems'
require 'fastercsv'

FasterCSV.foreach("/Users/developer/Work/madmin/db/data/Members.csv") do |row|
  puts row.inspect
end

and I get the following:

/usr/local/lib/ruby/gems/1.8/gems/fastercsv-1.5.0/lib/faster_csv.rb:1616:in `shift': FasterCSV::MalformedCSVError (FasterCSV::MalformedCSVError)

when I hit the row with this name.

I have been googling character encoding and UTF-8, but have not yet found a solution. I'd like to keep the special characters but would rather not have to edit each member name that fails.

Many thanks, Brett

A: 

I've read elsewhere that this can be fixed by setting KCODE. For example:

$KCODE = "U"

Put this at the top of your script, before the CSV file is read.

James Edward Gray has also said he's added encoding support to FasterCSV but it's in trunk only.

Peter Cooper
Looks like I actually mis-diagnosed this: the file was not UTF-8, as I thought, but ISO-8859-1. I converted it, but both of these answers would have worked if I had actually had UTF-8.
brett
A: 

It works right off the bat for me, but if you need to change the encoding, you can pass an encoding option to FasterCSV. For example, to tell it to use UTF-8, you can do this:

require 'rubygems'
require 'fastercsv'

FasterCSV.foreach("some file.csv", :encoding => 'u') do |row|
  puts row.inspect
end

The encoding options are listed in the documentation for FasterCSV.new.

Pesto
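
Since the file in the question turned out to be ISO-8859-1 rather than UTF-8, here is a minimal sketch of reading such a file without converting it by hand. It assumes Ruby 1.9 or later, where FasterCSV was merged into the standard library as CSV; it will not run on the Ruby 1.8 setup shown in the question. The helper name read_latin1_csv and the sample file contents are hypothetical, made up for illustration.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: read an ISO-8859-1 (Latin-1) CSV file and return
# its rows with every field transcoded to UTF-8.
# The "ISO-8859-1:UTF-8" string means: the file on disk is ISO-8859-1,
# and the strings handed back to the program should be UTF-8.
def read_latin1_csv(path)
  rows = []
  CSV.foreach(path, encoding: "ISO-8859-1:UTF-8") do |row|
    rows << row
  end
  rows
end

# Build a small sample file (an assumption, not the asker's data) that
# contains "Ciarán" with the "á" stored as the single Latin-1 byte 0xE1.
sample = Tempfile.new(["members", ".csv"])
sample.binmode
sample.write("Ciar\xE1n,Dublin\n".b)
sample.close

rows = read_latin1_csv(sample.path)
puts rows.inspect
```

Declaring the real source encoding lets the parser transcode each field, so the names keep their special characters and no row has to be edited by hand.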