views:

195

answers:

1

I have a table that has some Unicode in it. I know that the Unicode data is fine as it comes out as JSON on our webserver just fine. But for some reason the CSV that I'm generating is ending up mangled. Here's our current code:

  my $csv = Text::CSV->new ({ eol => "\015\012" });
  open my $fh, '>:encoding(utf8)', 'Foo.csv';
  my $sth = $dbh->prepare("SELECT * FROM Foo");
  $sth->execute();
  my $i = 0;
  while (my $row = $sth->fetchrow_hashref) {
     $csv->print($fh, [keys %$row]) if $i == 0;
     $csv->print($fh, [values %$row]);
     $i++;
  }

Any ideas?

+3  A: 

Aside from the encoding issue, I don't think there's any guarantee that values will always give the fields in the same order. You might be getting a different hashref from fetchrow_hashref each time you call it. The solution to that is to use fetchrow_arrayref.

Text::CSV recommends Text::CSV::Encoded:

my $csv = Text::CSV::Encoded->new({ eol => "\015\012" });
open my $fh, '>:raw', 'Foo.csv';
my $sth = $dbh->prepare("SELECT * FROM Foo");
$sth->execute();
$csv->print($fh, $sth->{NAME_lc}); # or NAME or NAME_uc

while (my $row = $sth->fetchrow_arrayref) {
   $csv->print($fh, $row);
}

Or, if you don't want to have to install a new module:

use Encode 'find_encoding';
my $utf8 = find_encoding('utf8');

my $csv = Text::CSV->new({ binary => 1, eol => "\015\012" });
open my $fh, '>:raw', 'Foo.csv';
my $sth = $dbh->prepare("SELECT * FROM Foo");
$sth->execute();
# I'm assuming your field names are ASCII:
$csv->print($fh, $sth->{NAME_lc}); # or NAME or NAME_uc

while (my $row = $sth->fetchrow_arrayref) {
   $csv->print($fh, [ map { $utf8->encode($_) } @$row ]);
}
cjm