I'm trying to parse a CSV file into a 2D array, where each row is a data entry and each column is a field in that entry.

Doing this all at once simplifies and separates my processing code from my parsing code.

I tried to write a simple parser that used String.split to separate each line of the file by commas. As I have discovered, this is a horrible approach: it completely fails on special cases such as double-quoted fields, embedded line feeds, and other special characters.

What is the proper way to parse a CSV file into a 2d array as I have described?

Code samples in Java would be appreciated. The array can be a dynamic list object, a vector, or something like that; it just has to be indexable with two indexes.

+3  A: 

Why not use a library? For example: http://opencsv.sourceforge.net/#where-can-I-get-it

itsme
A: 

Have you had a look at Commons CSV?

CSVParser parser = new CSVParser(new FileReader(file));
String[] line;
while ((line = parser.getLine()) != null) {
    // process the fields of this row, e.g. line[0]
}
Karl Johansson
A: 

Here is another reader.

Bart
A: 

If your file has double-quoted fields that contain separators, or fields that contain line feeds, then I doubt that it is a real CSV file. A proper CSV file looks something like this:

1;John;Doe;engineer,manager
2;Bart;Foo;engineer,dilbert

while this is "something else":

1;John;Doe;"engineer;manager"
2;Bart;Foo;
   "engineer,dilbert"

And the first example is parseable with String.split on each line.
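
For what it's worth, a minimal sketch of that split-per-line approach could look like the following. The class name SimpleSplit is made up for illustration, and the code assumes ';' as the separator and no quoting at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: naive split-based parsing, no quote handling.
final class SimpleSplit {
    // Returns one String[] per non-empty line, split on ';'.
    static String[][] parse(String text) {
        List<String[]> rows = new ArrayList<>();
        for (String line : text.split("\\R")) {   // \R matches any line break
            if (line.isEmpty()) continue;         // skip blank lines
            rows.add(line.split(";", -1));        // -1 keeps trailing empty fields
        }
        return rows.toArray(new String[0][]);
    }

    public static void main(String[] args) {
        String[][] table = parse(
            "1;John;Doe;engineer,manager\n2;Bart;Foo;engineer,dilbert\n");
        System.out.println(table[1][3]);
    }
}
```

The result is indexable as table[row][column]. Note that a line such as 1;John;Doe;"engineer;manager" comes back as five fields with this code, because the quoted separator is not treated specially.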

Andreas_D
CSV files can be much more complicated. Read the formal spec here: http://supercsv.sourceforge.net/csvSpecification.html and you'll see that newlines, double quotes and such are allowed within quoted fields.
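
To illustrate the quoting rules, here is a rough sketch of a hand-written parser that treats separators and newlines inside double quotes as data and reads a doubled quote as a literal quote. The class name QuotedCsv and its parse method are made up for this example, and blank lines are skipped by assumption. For real work a library is probably the safer choice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: quote-aware parsing into a list of rows.
final class QuotedCsv {
    static List<List<String>> parse(String text, char sep) {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        boolean pending = false; // true once the current row has any content

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (inQuotes) {
                if (c != '"') {
                    field.append(c);            // separators and newlines are data here
                } else if (i + 1 < text.length() && text.charAt(i + 1) == '"') {
                    field.append('"');          // doubled quote is a literal quote
                    i++;
                } else {
                    inQuotes = false;           // closing quote
                }
            } else if (c == '"') {
                inQuotes = true;
                pending = true;
            } else if (c == sep) {
                row.add(field.toString());      // end of field
                field.setLength(0);
                pending = true;
            } else if (c == '\n' || c == '\r') {
                // treat \r\n as a single line break
                if (c == '\r' && i + 1 < text.length() && text.charAt(i + 1) == '\n') i++;
                if (pending) {                  // end of a non-blank row
                    row.add(field.toString());
                    field.setLength(0);
                    rows.add(row);
                    row = new ArrayList<>();
                    pending = false;
                }
            } else {
                field.append(c);
                pending = true;
            }
        }
        if (pending) {                          // last row without trailing newline
            row.add(field.toString());
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> table =
            parse("1;John;Doe;\"engineer;manager\"\n2;Bart;Foo;\"line1\nline2\"\n", ';');
        System.out.println(table.get(0).get(3));
    }
}
```

The result can be indexed with two indexes as table.get(row).get(column).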
CodeFusionMobile
A: 

We had this same problem a couple of months ago, so we created a solution in C#, built around a reader that implements the IEnumerable interface and reads a new line of the CSV on each iteration.

I don't know if I can provide the code, but if you are interested in the solution I can go into more detail to help you create a new one.
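
In Java the closest analogue to IEnumerable is Iterable, so the general shape might be something like the sketch below. This is not our C# code: the class name CsvLineReader is invented for the example, and the field splitting is a naive placeholder that does not handle quoted fields.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.regex.Pattern;

// Hypothetical reader: yields one row per iteration, reading lazily.
// It wraps a single Reader, so it can only be iterated once.
final class CsvLineReader implements Iterable<String[]> {
    private final BufferedReader in;
    private final String sepRegex;

    CsvLineReader(Reader source, String sep) {
        this.in = new BufferedReader(source);
        this.sepRegex = Pattern.quote(sep); // treat the separator literally
    }

    @Override
    public Iterator<String[]> iterator() {
        return new Iterator<String[]>() {
            private String pending = readLine(); // look one line ahead

            private String readLine() {
                try {
                    return in.readLine();
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            }

            @Override
            public boolean hasNext() {
                return pending != null;
            }

            @Override
            public String[] next() {
                if (pending == null) throw new NoSuchElementException();
                String[] fields = pending.split(sepRegex, -1); // naive: no quotes
                pending = readLine();
                return fields;
            }
        };
    }

    public static void main(String[] args) {
        List<String[]> table = new ArrayList<>();
        for (String[] row : new CsvLineReader(new StringReader("1,John\n2,Bart\n"), ",")) {
            table.add(row);
        }
        System.out.println(table.get(1)[1]);
    }
}
```

Collecting the rows into a list, as in main, gives the two-index access the question asks for, while the reader itself only holds one line in memory at a time.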

Leo Nowaczyk