How do I split this comma-and-quote-delimited String into its embedded strings?

String test = "[\"String 1\",\"String, two\"]"; 
String[] embeddedStrings = test.split("<insert magic regex here>");
// Note: it should also work when a space follows the separating comma: "[\"String 1\", \"String, two\"]"

assertEquals("String 1", embeddedStrings[0]);
assertEquals("String, two", embeddedStrings[1]);

I'm fine with trimming the square brackets as a first step. But the catch is, even if I do that, I can't just split on a comma because embedded strings can have commas in them. Using Apache StringUtils is also acceptable.

A: 

This is extremely fragile and should be avoided, but you could match the string literals.

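import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Between two quotes, match either an ordinary character (not a quote or backslash)
// or a backslash followed by any character, so that \" does not end the literal.
Pattern p = Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

String test = "[\"String 1\",\"String, two\"]";
Matcher m = p.matcher(test);
List<String> embeddedStrings = new ArrayList<>();
while (m.find()) {
    // Group 1 is the text between the quotes.
    embeddedStrings.add(m.group(1));
}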

The regular expression assumes that double quotes in the input are escaped using \" and not "". The pattern would break if the input had an odd number of (unescaped) double quotes.

Matthew
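
To make the escaping assumption in the answer above concrete, here is a minimal sketch that wraps the matching loop in a method and also turns each \" back into a plain quote. The class name QuotedLiteralExtractor and the method extract are made up for illustration, and the sketch assumes \" is the only escape that needs undoing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any library.
final class QuotedLiteralExtractor {
    // Opening quote, then ordinary characters or backslash-escaped characters,
    // then the closing quote. Group 1 captures the text between the quotes.
    private static final Pattern LITERAL =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    static List<String> extract(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = LITERAL.matcher(input);
        while (m.find()) {
            // Assumption: only \" needs to be unescaped; other escapes are left as-is.
            result.add(m.group(1).replace("\\\"", "\""));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(extract("[\"String 1\",\"String, two\"]"));
        System.out.println(extract("[\"String 1\", \"String, two\"]"));
        System.out.println(extract("[\"say \\\"hi\\\"\",\"b\"]"));
    }
}
```

Because the pattern looks for quoted literals rather than separators, the space after the comma in the second input needs no special handling.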
+1  A: 

If you can strip the leading [" and the trailing "] from the outer string, so that it becomes:

      String test = "String 1\",\"String, two";

You can use:

     test.split("\",\"");
Moro
I ended up going with this. It's ugly, as most regex is, but it's effective and my options are limited:

     String noBrackets = StringUtils.substringBetween(test, "[\"", "\"]");
     String[] results = noBrackets.split("\",[ ]*\"");
emulcahy
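
For anyone without Apache Commons on the classpath, the same strip-then-split idea can be sketched with the JDK alone. The class name BracketSplit is hypothetical, and the sketch assumes that no embedded string contains a quote, comma, quote sequence of its own, since that would be mistaken for a separator.

```java
import java.util.Arrays;

// Hypothetical helper, not part of any library.
final class BracketSplit {
    static String[] split(String input) {
        String trimmed = input.trim();
        // The length check rules out inputs such as ["] where the two markers overlap.
        if (trimmed.length() < 4 || !trimmed.startsWith("[\"") || !trimmed.endsWith("\"]")) {
            throw new IllegalArgumentException("expected [\"...\"] but got: " + input);
        }
        // Drop the leading [" and the trailing "].
        String inner = trimmed.substring(2, trimmed.length() - 2);
        // Separator: closing quote, comma, optional whitespace, opening quote.
        // The limit of -1 keeps trailing empty strings instead of discarding them.
        return inner.split("\",\\s*\"", -1);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("[\"String 1\",\"String, two\"]")));
        System.out.println(Arrays.toString(split("[\"String 1\", \"String, two\"]")));
    }
}
```

A comma inside an embedded string is left alone because it is not sitting between a closing and an opening quote.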
A: 
Ed Griebel
+3  A: 

You could also use one of the many small open source libraries for parsing CSV, e.g. opencsv or Commons CSV.

Mirko Nasato
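
As a rough illustration of the quote-aware scanning that a CSV-style parser does, here is a hand-rolled single-pass sketch in plain Java. It is not the API of opencsv or Commons CSV. The class name QuoteAwareSplitter is made up, and the sketch assumes backslash escapes inside quotes and ignores everything outside them (brackets, commas, whitespace).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class QuoteAwareSplitter {
    static List<String> parse(String input) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (inQuotes) {
                if (c == '\\' && i + 1 < input.length()) {
                    // Backslash escape: take the next character literally.
                    current.append(input.charAt(++i));
                } else if (c == '"') {
                    // An unescaped quote closes the current field.
                    fields.add(current.toString());
                    current.setLength(0);
                    inQuotes = false;
                } else {
                    current.append(c);
                }
            } else if (c == '"') {
                // An unescaped quote outside a field opens a new one.
                inQuotes = true;
            }
            // Any other character outside quotes is skipped.
        }
        if (inQuotes) {
            throw new IllegalArgumentException("unterminated quoted string");
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(parse("[\"String 1\", \"String, two\"]"));
    }
}
```

A real CSV library additionally handles details such as doubled quotes, line breaks inside fields, and configurable delimiters, which is the main reason to prefer one over a sketch like this.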