After a split, items[1] holds the string "MO""RET". I then call replaceAll on it, which removes all the double quotes, but I want the result to be MO"RET. In the CSV file I am processing, double quotes inside a text field are doubled (example: "This account is a ""large"" one"). So I want to keep one of the two quotes wherever a quote is doubled in the middle of the string, and drop the enclosing quotes if they are present. How can I do that?

String items[] = line.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
// items[1] now holds "MO""RET"
String recordType = items[1].replaceAll("\"","");

After this, recordType holds MORET, but I want it to hold MO"RET.

+1  A: 

How about:

String recordType = items[1].replaceAll( "\"\"", "\"" );
PSpeed
Thanks a lot. What if my string has the value "TEST"REPLA", with only a single double quote in the middle? How can I delete the first and last quotes and keep every quote in between? I want the output to be TEST"REPLA. Example 2: for "EXAM"PLE"2IN" I want EXAM"PLE"2IN. Only the first and last quotes need to be deleted.
Arav
It's difficult to do this with a regex and still cover cases such as a starting quote with no ending quote, and the regex gets really complicated. At that point you are better off parsing the whole line. If you really only care about the specific start/end quote case, just check for it with charAt() and do a substring. It will be faster than a regex anyway.
PSpeed
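
The charAt()/substring check suggested in the comment above could be sketched roughly as follows. The class name QuoteStripper and the helper unquote are hypothetical, chosen only for illustration, and the sketch assumes the enclosing quotes should be dropped only when both are present.

```java
class QuoteStripper {
    // Hypothetical helper (not from the thread): strips one pair of enclosing
    // quotes, then collapses CSV-style doubled quotes into a single quote.
    static String unquote(String field) {
        String s = field;
        // Drop the enclosing quotes only when both the first and last character are quotes.
        if (s.length() >= 2 && s.charAt(0) == '"' && s.charAt(s.length() - 1) == '"') {
            s = s.substring(1, s.length() - 1);
        }
        // Literal replacement, no regex involved.
        return s.replace("\"\"", "\"");
    }

    public static void main(String[] args) {
        System.out.println(unquote("\"MO\"\"RET\""));      // MO"RET
        System.out.println(unquote("\"TEST\"REPLA\""));    // TEST"REPLA
        System.out.println(unquote("\"EXAM\"PLE\"2IN\"")); // EXAM"PLE"2IN
    }
}
```

A field with only a leading quote and no trailing quote is left untouched by this sketch; whether that is the right behavior depends on how malformed input should be treated.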
+2  A: 

Don't use regex to split a CSV line. This is asking for trouble ;) Just parse it character-by-character. Here's an example:

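import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public static List<List<String>> parseCsv(InputStream input, char separator)
    throws IOException
{
    BufferedReader reader = null;
    List<List<String>> csv = new ArrayList<List<String>>();
    // Quote the separator so that characters like '|' or '.' are matched literally.
    String trailingSeparator = Pattern.quote(String.valueOf(separator)) + "$";
    try {
        reader = new BufferedReader(new InputStreamReader(input, "UTF-8"));
        for (String record; (record = reader.readLine()) != null;) {
            boolean quoted = false; // true while inside a quoted field
            StringBuilder fieldBuilder = new StringBuilder();
            List<String> fields = new ArrayList<String>();
            for (int i = 0; i < record.length(); i++) {
                char c = record.charAt(i);
                fieldBuilder.append(c);
                if (c == '"') {
                    quoted = !quoted;
                }
                // A field ends at an unquoted separator or at the end of the record.
                if ((!quoted && c == separator) || i + 1 == record.length()) {
                    // Strip the trailing separator and the enclosing quotes, then unescape doubled quotes.
                    fields.add(fieldBuilder.toString().replaceAll(trailingSeparator, "")
                        .replaceAll("^\"|\"$", "").replace("\"\"", "\"").trim());
                    fieldBuilder = new StringBuilder();
                }
            }
            csv.add(fields);
        }
    } finally {
        if (reader != null) try { reader.close(); } catch (IOException logOrIgnore) {}
    }
    return csv;
}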

Yes, there's a little regex involved, but it only trims off the trailing separator and the surrounding quotes of a single field.

You can however also grab any 3rd party Java CSV API.

BalusC
Thanks a lot. What if my string has the value "TEST"REPLA", with only a single double quote in the middle? How can I delete the first and last quotes and keep every quote in between? I want the output to be TEST"REPLA. Example 2: for "EXAM"PLE"2IN" I want EXAM"PLE"2IN. Only the first and last quotes need to be deleted.
Arav
The posted code example already does that (assuming that your CSV file adheres to RFC 4180, as outlined here: http://www.rfc-editor.org/rfc/rfc4180.txt ).
BalusC
A: 

I'd recommend using replace instead of replaceAll. replaceAll treats its first argument as a regular expression.

The requirement is to replace two consecutive quotes with a single quote:

String recordType = items[1].replace( "\"\"", "\"" );

To see the difference between replace and replaceAll, run the code below:

recordType = items[1].replace( "$$", "$" );
recordType = items[1].replaceAll( "$$", "$" );
Sreejesh
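
To make the difference concrete, here is a small self-contained sketch. The class ReplaceDemo, its method names, and the sample string are hypothetical. It assumes a current JDK, where the unescaped replaceAll call is expected to fail with a runtime exception because "$" in the replacement is read as the start of a group reference, while "$$" in the pattern is read as two end-of-input anchors.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceDemo {
    // replace(): both arguments are plain text.
    static String literal(String s) {
        return s.replace("$$", "$");
    }

    // replaceAll() can be made literal by escaping both the pattern and the replacement.
    static String quotedRegex(String s) {
        return s.replaceAll(Pattern.quote("$$"), Matcher.quoteReplacement("$"));
    }

    // Reports whether the unescaped replaceAll() call throws.
    static boolean rawRegexFails(String s) {
        try {
            s.replaceAll("$$", "$");
            return false;
        } catch (RuntimeException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String price = "cost: $$5";
        System.out.println(literal(price));       // cost: $5
        System.out.println(quotedRegex(price));   // cost: $5
        System.out.println(rawRegexFails(price)); // true
    }
}
```

For the doubled-quote case in the question neither character is special in a regex, so both methods give the same result; replace simply avoids the regex machinery.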
Thanks a lot. In case if the my string has a value of "TEST"REPLA". If there is only one single double quote in the middle of the string how can i delete the first ,last quote and retain all the middle quote. I want the output as TEST"REPLA Example 2 : "EXAM"PLE"2IN" I want the output as EXAM"PLE"2IN First and last quotes needs to be deleted
Arav
A: 

Here you can use the regular expression.

recordType = items[1].replaceAll( "\\B\"", "" ); 
recordType = recordType.replaceAll( "\"\\B", "" ); 

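The first statement removes any quote that is not preceded by a word character, such as the leading quote. The second statement removes any quote that is not followed by a word character, such as the trailing quote. Note that a quote inside the field that sits next to a space or punctuation is removed as well.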

Sreejesh
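
As a quick check of the two \B statements above, the following sketch applies them to the examples from the comments and to one input with spaces around the inner quotes. The class BoundaryQuoteDemo, the method name, and the last input are hypothetical additions for illustration.

```java
class BoundaryQuoteDemo {
    // Applies the two replaceAll calls from the answer in sequence.
    static String stripOuterQuotes(String field) {
        return field.replaceAll("\\B\"", "").replaceAll("\"\\B", "");
    }

    public static void main(String[] args) {
        System.out.println(stripOuterQuotes("\"TEST\"REPLA\""));    // TEST"REPLA
        System.out.println(stripOuterQuotes("\"EXAM\"PLE\"2IN\"")); // EXAM"PLE"2IN
        // Inner quotes next to a space are not adjacent to a word character
        // on one side, so they are stripped too.
        System.out.println(stripOuterQuotes("\"a \"b\" c\""));      // a b c
    }
}
```

So the approach works when every inner quote sits between two word characters, and loses quotes otherwise.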
A: 

Thanks a lot. I want to vote to close the answer, but it says I need 15 reputation. I'm not sure how to close the question.

Arav
A: 

I want to vote to close the question, but when I click the up arrow it says I need 15 reputation, so I'm not sure how to close it. I also don't see a tick icon to accept the answer.

Arav