Hi,

I am trying to convert a String into an ArrayList. For example, my Struts2 webapp returns this String named row in a format similar to this:

[A, BB, CCC, DDDD, 1, 0, 1] (something along those lines)

I need to convert them into an ArrayList so I can prepopulate some forms in another JSP page. I hardcoded a method to convert such Strings into list form:

    StringBuffer rowBuffer = new StringBuffer(row);
    int startIndex = 0;
    int endIndex = rowBuffer.indexOf(",") - 1;
    rowBuffer.deleteCharAt(rowBuffer.indexOf("["));
    rowBuffer.deleteCharAt(rowBuffer.indexOf("]"));
    while (startIndex != -1 && endIndex != -1 && startIndex < endIndex)
    {
        String subString = rowBuffer.substring(startIndex, endIndex);

        if (subString.contains(","))
        {
            rowList.add(" ");
            startIndex = endIndex + 1;
            endIndex = rowBuffer.indexOf(",", startIndex);
        }
        else
        {
            rowList.add(subString);
            startIndex = endIndex + 2;
            endIndex = rowBuffer.indexOf(",", startIndex + 1);
        }

        if (endIndex == -1)
        {
            rowList.add(rowBuffer.substring(startIndex));
            break;
        }
    }

This works fine when all the fields are populated. However, let's say I have a String that looks like this: [A, BB, , , 1, 0, 0] (the 3rd and 4th fields are missing). Then the result is wrong: the blank elements don't register correctly, and the size of the list is 6 when it should be 7. Is there a more elegant solution than hardcoding? If not, could someone point me in the right direction on how to handle cases with blank fields? Thanks!

+1  A: 

Try this please:

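import java.util.ArrayList;
import java.util.Arrays;
import java.util.regex.Pattern;

// ...

// Strip the surrounding brackets (rowBuffer is the StringBuffer from the question)
rowBuffer.deleteCharAt(rowBuffer.indexOf("["));
rowBuffer.deleteCharAt(rowBuffer.indexOf("]"));
// Create a pattern to match the delimiters: a comma with optional whitespace on either side
Pattern p = Pattern.compile("\\s*,\\s*");
// Split the input with the pattern
String[] result = p.split(rowBuffer);
rowList = new ArrayList<String>(Arrays.asList(result));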

NOTE: this presupposes that the strings themselves do not contain commas and are not quoted. If you want to parse real CSV with commas in the fields and quoted values, do NOT use regular expressions and split; use a dedicated state-machine CSV parser instead (here's one example: http://opencsv.sourceforge.net/ - or you can roll your own, like BalusC's example here).
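To illustrate what "roll your own" could look like, here is a minimal sketch of a state-machine splitter. It is not taken from opencsv or from the example mentioned above; the class name `QuotedFieldSplitter` and its `split` method are hypothetical. It assumes fields are separated by commas, may be wrapped in double quotes, and use a doubled quote to escape a literal quote. As a simplification it trims whitespace around every field, including quoted ones.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of any library.
final class QuotedFieldSplitter {

    // Splits a line on commas, treating commas inside double quotes as data.
    // A doubled quote ("") inside a quoted field stands for one literal quote.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        current.append('"'); // escaped quote inside a quoted field
                        i++;
                    } else {
                        inQuotes = false;    // closing quote
                    }
                } else {
                    current.append(c);       // any other character, commas included
                }
            } else if (c == '"') {
                inQuotes = true;             // opening quote
            } else if (c == ',') {
                fields.add(current.toString().trim()); // end of field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString().trim()); // last field
        return fields;
    }

    public static void main(String[] args) {
        // The second field contains a comma, the third is blank.
        for (String field : split("A, \"B,B\", , 1")) {
            System.out.println("{" + field + "}");
        }
    }
}
```

For anything beyond a simple case like this (multi-line fields, other escape rules), a tested CSV library is likely the safer choice.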

DVK
Hi DVK, in some cases I have spaces inside the Strings (example: [AAA, BB BBB, CCC, DDDD, 1, 0, 1]), and I want to get "BB BBB", not "BB" and "BBB". I don't really know regular expressions yet... so could you help me?
Raymond
Looks to me like @DVK's code handles spaces immediately before and after the comma delimiter... --JA
andersoj
@andersoj - yes, you are correct
DVK
@Raymond Chang - `\s*,\s*` means match a comma surrounded on both left and right by optional whitespace. Exactly what you need.
DVK
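A small sketch of that point, using the example string from the comment above: only whitespace next to a comma is consumed by the delimiter, so a space inside a field survives. The class name `DelimiterDemo` and the `fields` method are made up for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are not from the thread.
final class DelimiterDemo {

    // A comma with optional whitespace on either side.
    private static final Pattern DELIMITER = Pattern.compile("\\s*,\\s*");

    // Drops the first and last character (assumed to be '[' and ']') and splits the rest.
    static List<String> fields(String bracketed) {
        String inner = bracketed.substring(1, bracketed.length() - 1);
        return Arrays.asList(DELIMITER.split(inner));
    }

    public static void main(String[] args) {
        for (String field : fields("[AAA, BB BBB, CCC, DDDD, 1, 0, 1]")) {
            System.out.println("{" + field + "}");
        }
    }
}
```

With this input the second element stays "BB BBB" as a single field.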
@DVK: You wrote `compile("[\\s*,\\s*]+")`; you misunderstood how character class `[...]` works. This will match asterisks.
polygenelubricants
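To make the difference concrete, here is a short sketch comparing the two patterns. Inside `[...]`, the characters `\s`, `*` and `,` are simply alternatives, so the bracketed version splits on any run of whitespace, asterisks, or commas. The class and method names are hypothetical.

```java
// Hypothetical demo class comparing the two patterns discussed in the comments.
final class CharacterClassDemo {

    // Character class: any run of whitespace, '*' or ',' acts as a delimiter.
    static String[] withCharacterClass(String s) {
        return s.split("[\\s*,\\s*]+");
    }

    // Sequence: a comma with optional whitespace before and after it.
    static String[] withSequence(String s) {
        return s.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        String input = "A*B, BB BBB";
        System.out.println(withCharacterClass(input).length); // 4 pieces: A, B, BB, BBB
        System.out.println(withSequence(input).length);       // 2 pieces: A*B, BB BBB
    }
}
```

This is also why the bracketed version would break "BB BBB" into two elements.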
What if the strings contain `commas` themselves? :/
st0le
@polygenelubricants - will fix, you're correct of course *facepalm*
DVK
@st0le - if the string contains commas, using regular expressions to parse it is a Bad Idea and you should use a dedicated CSV parser (write your own if Java doesn't provide anything like Perl's Text::CSV_XS - it's messy and tricky but doable, and easier than many other parsers). I noted that in the updated answer.
DVK
A: 

Well, I changed my code a bit and made it work (it's probably far from the most elegant solution, but it works for me):

    StringBuffer rowBuffer = new StringBuffer(row);
    int startIndex = 0;
    int endIndex = rowBuffer.indexOf(",") - 1;
    rowBuffer.deleteCharAt(rowBuffer.indexOf("["));
    rowBuffer.deleteCharAt(rowBuffer.indexOf("]"));
    while (startIndex != -1 && endIndex != -1 && startIndex < endIndex)
    {
        String subString = rowBuffer.substring(startIndex, endIndex);

        if (subString.contains(","))
        {
            rowList.add(" ");
            startIndex = endIndex - 1;
            endIndex = rowBuffer.indexOf(", ", startIndex + 1);
        }
        else
        {
            if (subString.equals("1"))
                rowList.add("True");
            else if (subString.equals("0"))
                rowList.add("False");
            else
                rowList.add(subString);
            startIndex = endIndex + 2;
            endIndex = rowBuffer.indexOf(",", startIndex + 1);
        }

        if (endIndex == -1)
        {
            if (subString.equals("1"))
                rowList.add("True");
            else if (subString.equals("0"))
                rowList.add("False");
            break;
        }
    }
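For comparison, the same mapping can probably be written without manual index bookkeeping. The sketch below is not the code above; it is an alternative built on `String.split`, and the class name `RowConverter` is hypothetical. It assumes, as the code above appears to, that every field equal to "1" or "0" should become "True" or "False" and that a blank field should be stored as a single space.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical alternative to the index-based loop; names are illustrative.
final class RowConverter {

    static List<String> convert(String row) {
        // Drop the first and last character (assumed to be '[' and ']').
        String inner = row.substring(1, row.length() - 1);
        List<String> rowList = new ArrayList<String>();

        // A limit of -1 keeps trailing blank fields instead of discarding them.
        for (String field : inner.split("\\s*,\\s*", -1)) {
            if (field.equals("1")) {
                rowList.add("True");
            } else if (field.equals("0")) {
                rowList.add("False");
            } else if (field.isEmpty()) {
                rowList.add(" ");   // blank field stored as a single space (assumption)
            } else {
                rowList.add(field);
            }
        }
        return rowList;
    }

    public static void main(String[] args) {
        System.out.println(convert("[A, BB, , , 1, 0, 0]").size());
    }
}
```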
Raymond
A: 

Assuming that the format is as specified by AbstractCollection.toString(), then you can simply:

  • Remove the surrounding brackets (with simple substring)
  • Then split on ", " (comma and space)
  • Wrap the String[] into a List<String> using Arrays.asList
    • Use that to populate an ArrayList<String> if necessary

Note that this will break if the elements themselves can contain ", ". For this to work, that string must be a delimiter, and never part of the actual token.

Here's a snippet to illustrate:

    String s = "[A, BB, CCC, DDDD, 1, 0, 1]";
    String[] parts = s.substring(1, s.length() - 1).split(", ");
    List<String> list = new ArrayList<String>(Arrays.asList(parts));

    for (String part : list) {
        System.out.print("{" + part + "} ");
    } // {A} {BB} {CCC} {DDDD} {1} {0} {1} 
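Two edge cases are worth a look when applying this to the question's input: blank fields and the empty list. A rough sketch follows, with a hypothetical `BracketedListParser.parse` helper. By default `split` discards trailing empty strings, so the sketch passes a limit of -1 to keep them. It also treats "[]" as an empty list, although that string could equally come from a list holding one empty string; the two cannot be told apart from the text alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; assumes the input has the "[a, b, c]" shape.
final class BracketedListParser {

    static List<String> parse(String s) {
        String inner = s.substring(1, s.length() - 1);
        if (inner.isEmpty()) {
            return new ArrayList<String>();  // "[]" is read as an empty list (assumption)
        }
        // Limit -1 keeps trailing empty strings such as the last field of "[A, ]".
        return new ArrayList<String>(Arrays.asList(inner.split(", ", -1)));
    }

    public static void main(String[] args) {
        System.out.println(parse("[A, BB, , , 1, 0, 0]").size()); // 7
        System.out.println(parse("[A, ]").size());                // 2
        System.out.println(parse("[]").size());                   // 0
    }
}
```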
polygenelubricants
A: 

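You can try:

// Remove '[' and ']', then split on commas (elements after the first keep their leading space)
String[] splitArray = temp.substring(1, temp.length() - 1).split(",");
// Holds the result
List<String> returnValue = new ArrayList<String>();
for (int i = 0; i < splitArray.length; i++)
    // Skip fields that are a single space, so blank fields are dropped rather than kept
    if (!splitArray[i].equals(" "))
        returnValue.add(splitArray[i]);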
VinAy