Parsing CSV files correctly is trickier than it seems at first sight. At a minimum you need to:
- Honour the original text encoding
- Make sure you can import escaped delimiters, e.g.: 23,10/02/2010,"hello, world",34.5
- Apply correct date format and decimal point format depending on the file locale
- Treat the quotes correctly
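To make the points above concrete, here is a rough JDK-only sketch of what a parser has to do for the sample line. It is not how either library below works internally; `CsvLineSketch`, `splitLine` and `parseDecimal` are names made up for illustration, and the `dd/MM/yyyy` date pattern and UK locale are assumptions about the file, since 10/02/2010 is ambiguous on its own.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CsvLineSketch {

    // Splits one CSV line on commas, keeping commas that sit inside double
    // quotes and turning a doubled quote ("") into a literal quote.
    // Simplification: fields that span several lines are not handled.
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // delimiters are plain text here
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());      // last field has no trailing comma
        return fields;
    }

    // Parses a decimal using the conventions of the given locale.
    static double parseDecimal(String text, Locale locale) throws ParseException {
        return NumberFormat.getInstance(locale).parse(text).doubleValue();
    }

    public static void main(String[] args) throws ParseException {
        List<String> fields = splitLine("23,10/02/2010,\"hello, world\",34.5");
        // Assumption: day/month/year dates and a '.' decimal point.
        LocalDate date = LocalDate.parse(fields.get(1), DateTimeFormatter.ofPattern("dd/MM/yyyy"));
        double amount = parseDecimal(fields.get(3), Locale.UK);
        System.out.println(fields.get(2) + " | " + date + " | " + amount);
    }
}
```

Run as is, this prints `hello, world | 2010-02-10 | 34.5`. A file written with German conventions would carry `34,5` instead, which `parseDecimal("34,5", Locale.GERMANY)` reads as the same value, which is why the locale has to be known up front.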
If it's a quick task, I suggest using an existing library. There are at least two open-source CSV libraries for Java with very similar APIs:
- Java CSV Library
- OpenCSV
I tried both, starting with OpenCSV, which threw an OutOfMemoryError while merely reading a 600MB CSV file line by line. Apparently there is a memory leak in the current version of the library.
I didn't have time to debug it, so I just switched to Java CSV, since the two have surprisingly similar APIs for basic operations, and it worked like a charm.
Java CSV lets you access columns either by index or by column name (if the file has a header row).
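On the memory issue mentioned above: reading a file record by record should only need memory for one line at a time, whatever the file size. The sketch below shows that pattern with the plain JDK and an explicit encoding. It is not the API of OpenCSV or Java CSV; `StreamingSketch` and `countMatchingLines` are hypothetical names, UTF-8 is an assumed encoding, and quoted fields containing line breaks are ignored.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class StreamingSketch {

    // Counts the lines that contain the given text, holding only the
    // current line in memory and decoding with the caller's charset.
    static long countMatchingLines(Path file, Charset charset, String needle) throws IOException {
        long count = 0;
        try (BufferedReader in = Files.newBufferedReader(file, charset)) {
            String line;
            while ((line = in.readLine()) != null) {
                if (line.contains(needle)) {
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Small temporary file standing in for a large CSV.
        Path tmp = Files.createTempFile("sample", ".csv");
        try {
            Files.write(tmp, "id,name\n1,Zo\u00eb\n2,Bob\n".getBytes(StandardCharsets.UTF_8));
            System.out.println(countMatchingLines(tmp, StandardCharsets.UTF_8, "Zo\u00eb"));
        } finally {
            Files.delete(tmp);
        }
    }
}
```

Passing the charset explicitly, instead of relying on the platform default, is what keeps non-ASCII values intact when the file comes from another machine.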
UPDATE
Using Java CSV Lib you'll have to do something along these lines to access individual rows (quick'n'dirty, might not compile):
    import java.io.IOException;

    import com.csvreader.CsvReader;

    class Parser {
        public static void main(String[] args) throws IOException {
            CsvReader reader = new CsvReader("input file name.csv",
                    ',' /* delimiter */);
            while (reader.readRecord()) {
                // full row as raw text; you can use a regex to find
                // any rows you specifically want
                String row = reader.getRawRecord();
                // value of the first field (indexes start at 0)
                String col = reader.get(0);
                // all fields of the current row as an array
                String[] cols = reader.getValues();
            }
            reader.close();
        }
    }