Consider the following 2 by 2 array:

x = {{"a b c", "1,2,3"}, {"i \"comma-heart\" you", "i \",heart\" u, too"}}

If we Export that to CSV and then Import it again, we don't get the same thing back:

Import[Export["tmp.csv", x]]

Looking at tmp.csv it's clear that the Export didn't work, since the quotes are not escaped properly.

According to the RFC (RFC 4180), which I presume is summarized correctly in Wikipedia's entry on CSV, the right way to export the above array is as follows:

a b c, "1,2,3"
"i ""comma-heart"" you", "i "",heart"" u, too"

Importing the above does not yield the original array either. So Import is broken as well.
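
For reference, the quoting rule described above is small enough to sketch by hand. The following is only a rough sketch, assuming every entry is a string (or something ToString can render); csvField and toCSV are hypothetical helper names, not built-in functions, and this is not how Export works internally.

```mathematica
(* Hypothetical helpers, not built-ins. *)
(* Quote a field only when it contains a comma, a double quote, or a line break; *)
(* inside a quoted field, every embedded double quote is doubled. *)
csvField[s_String] :=
  If[StringFreeQ[s, "," | "\"" | "\n" | "\r"],
    s,
    "\"" <> StringReplace[s, "\"" -> "\"\""] <> "\""]

(* Assumption: non-string entries are simply rendered with ToString. *)
csvField[other_] := csvField[ToString[other]]

(* Join fields with commas and rows with newlines. *)
toCSV[rows_List] :=
  StringJoin[Riffle[StringJoin[Riffle[csvField /@ #, ","]] & /@ rows, "\n"]]

x = {{"a b c", "1,2,3"}, {"i \"comma-heart\" you", "i \",heart\" u, too"}};
csv = toCSV[x]
```

The resulting string could then be written out verbatim with something like Export["tmp.csv", csv, "Text"], which sidesteps the CSV exporter entirely. Reading such a file back still depends on Import handling doubled quotes, so this covers only the writing side.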

I've reported these bugs to [email protected] but I'm wondering if others have workarounds in the meantime.

One workaround is to just use TSV instead of CSV. I tested the above with TSV and it seems to work (even with tabs embedded in the entries of the array).
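
A quick way to check whether a given format survives the round trip is to compare the re-imported data with the original. This is a minimal sketch; roundTripQ is a hypothetical helper name, and it uses ImportString/ExportString so no file is needed.

```mathematica
(* roundTripQ is a hypothetical helper, not a built-in. *)
(* It exports to a string in the given format, imports that string back, *)
(* and tests whether the result is identical to the original data. *)
roundTripQ[data_, fmt_String] :=
  ImportString[ExportString[data, fmt], fmt] === data

x = {{"a b c", "1,2,3"}, {"i \"comma-heart\" you", "i \",heart\" u, too"}};

roundTripQ[x, "TSV"]
roundTripQ[x, "CSV"]
```

One caveat: as far as I know, Import turns numeric-looking fields into numbers by default, so an array containing a string such as "42" would fail this check for a reason that has nothing to do with quoting.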

+2  A: 

Instead of TSV, another workaround is to use a different delimiter:

In[26]:= str = ExportString[x, "CSV", "TextDelimiters"->"'"]
Out[26]= "'a b c','1,2,3'
'i \"comma-heart\" you','i \",heart\" u, too'"

In[27]:= y = ImportString[str, "CSV", "TextDelimiters"->"'"]
Out[27]= {{"a b c", "1,2,3"}, {"i \"comma-heart\" you", "i \",heart\" u, too"}}

In[28]:= x == y
Out[28]= True

Note that Import/Export and ImportString/ExportString take the same options; the latter simply read and write strings instead of files.

You could also use one of the other tabular/scientific data formats that Mathematica supports, like XLS, ODS, HDF, HDF5, CDF, FITS, etc.

You might also find some of them faster since some of them are binary and there is thus no textual parsing to be done. It all depends on your application and what the file is used for outside of Mathematica.
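
As a rough sketch of the spreadsheet route, assuming the XLS converter is available in your version: the file location below is an arbitrary choice, and the {"Data", 1} element asks for the first sheet, since a spreadsheet import otherwise comes back as a list of sheets.

```mathematica
x = {{"a b c", "1,2,3"}, {"i \"comma-heart\" you", "i \",heart\" u, too"}};

(* Assumed location: a scratch file in the system temporary directory. *)
file = FileNameJoin[{$TemporaryDirectory, "tmp.xls"}];

(* Write the array as a single-sheet spreadsheet. *)
Export[file, x];

(* Read back only the first sheet rather than the list of all sheets. *)
y = Import[file, {"Data", 1}];

x === y
```

Since cell contents are stored as-is rather than parsed out of delimited text, embedded commas and quotes should not need any escaping here.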

Michael Pilat
Good idea. Though I imagine we could concoct an example with both escaped single and double quotes that would make your version break the same way as mine.
dreeves
Of course; such is the nature of workarounds =) You could also use one of the other tabular/scientific data formats that Mathematica supports, like XLS, ODS, HDF, HDF5, CDF, FITS, etc. You might also find some of them faster since some of them are binary and there is thus no textual parsing to be done. It all depends on your application and what the file is used for outside of Mathematica.
Michael Pilat
Thanks Michael. Good thoughts; you should make that part of your answer!
dreeves
Good call, I edited my answer and added some links.
Michael Pilat