I am planning to change the file format so that every field must be enclosed in double quotes, for example:

"A","Field1","Field2","Field3","Fi"el,d","Fi""eld"

I want the separator to be the two characters combined, i.e. ", (a double quote followed by a comma). How do I change the split command below so that it splits on ", (double quote and comma together) instead of on a comma alone?

line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)",15);

+2  A: 

How do I change the split command below to use the two-character separator ", (double quote and comma)?

This would do it:

line.split("\",");

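You'd need to trim the extra quotes that aren't removed by the split: each field keeps its opening quote, and the last field keeps its closing quote too. You could also consider splitting on "\",\"" instead, which leaves only the first field's opening quote and the last field's closing quote to strip.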
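To make the leftover quotes concrete, here is a minimal sketch of both variants. The class and method names (QuotedSplit, splitOnQuoteComma, splitFullyQuoted) are made up for illustration, and the code assumes every line starts and ends with a double quote and that no field contains the sequence "," itself. The limit of -1 is my addition so that trailing empty fields are kept.

```java
class QuotedSplit {
    // Variant 1: split on the two-character separator ", (quote + comma).
    // Every field keeps its opening quote; the last one also keeps its closing quote.
    static String[] splitOnQuoteComma(String line) {
        return line.split("\",", -1);
    }

    // Variant 2: strip the outermost quotes first, then split on "," (quote, comma, quote).
    // Assumption: the line is fully quoted, otherwise we refuse to guess.
    static String[] splitFullyQuoted(String line) {
        if (line.length() < 2 || !line.startsWith("\"") || !line.endsWith("\"")) {
            throw new IllegalArgumentException("line must start and end with a double quote");
        }
        String inner = line.substring(1, line.length() - 1);
        return inner.split("\",\"", -1);
    }

    public static void main(String[] args) {
        // "A","Field1","Fi"el,d","Fi""eld"
        String line = "\"A\",\"Field1\",\"Fi\"el,d\",\"Fi\"\"eld\"";
        for (String field : splitOnQuoteComma(line)) {
            System.out.println("raw:     " + field);
        }
        for (String field : splitFullyQuoted(line)) {
            System.out.println("trimmed: " + field);
        }
    }
}
```

With the sample line, variant 1 yields "A, "Field1, "Fi"el,d and "Fi""eld" (quotes still attached), while variant 2 yields A, Field1, Fi"el,d and Fi""eld. Note that variant 2 leaves a doubled quote such as "" untouched; whether that should be collapsed to a single quote depends on how the file is written.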

However, instead of reinventing the wheel, I'd suggest that you try to find an existing CSV reader for your platform. It will be better and faster and a lot less work.

Mark Byers
Thanks a lot, I will try this one. The comma was causing problems when the data contains double quotes, so I want to use the two separators combined. You wrote "You'd need to trim the extra quotes that aren't removed by the split", but I don't follow that. Do you mean the last field in the line?
Arav
I just wrote an answer suggesting a few CSV libraries, then noticed that Mark had already suggested using an existing CSV library. SuperCSV looks pretty good at a glance, but there are at least 4 others which should also do the job.
rob
+1  A: 

Our application also supported comma-separated files for years. All went well until customers started to put double quotes into strings. We solved that by also allowing values to be enclosed in single quotes (and not allowing single quotes inside double-quoted values, or double quotes inside single-quoted values). But then customers wanted both single and double quotes in the same string, or could no longer generate the file easily because the enclosing characters depended on the values.

Then we started supporting backslashes, but things only became worse.

We finally solved the problem by using TAB as the separator (instead of a comma). TABs never appear in our string values, so no quotes are needed anymore. Problem solved.
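For illustration only, reading such a file in Java could look like the sketch below. The class and method names (TabSplit, splitTabs) are hypothetical, and the whole approach rests on the assumption that no field value ever contains a TAB character.

```java
class TabSplit {
    // Splits one tab-separated line into its fields.
    // The limit of -1 keeps trailing empty fields instead of dropping them.
    static String[] splitTabs(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        // Quotes and commas in the data need no special handling here.
        String line = "A\tField1\tFi\"el,d\tFi\"\"eld\t";
        String[] fields = splitTabs(line);
        for (int i = 0; i < fields.length; i++) {
            System.out.println(i + ": [" + fields[i] + "]");
        }
    }
}
```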

Patrick
Thanks a lot. Some of the systems have already been developed, so I can't change the delimiter now.
Arav