+3  Q: 

Split CSV String

How would I split the following string?

test, 7535, '1,830,000', '5,000,000'

The result should be

test
7535
'1,830,000'
'5,000,000'

I tried:

Dim S() As String = mystring.Split(","c)

But I get:

test
7535
'1
830
000'
'5
000
000'

Thanks

+6  A: 

Don't parse CSV manually when you have handy, good-quality libraries available. Please!

CSV parsing has many potential pitfalls, and this library, according to my testing, solves most of them neatly.

That said, if this is a one-off task and the strings are always like your example, you can use a regex, like this (VB.NET syntax might be wrong, please fix):

        ' Requires: Imports System.Text.RegularExpressions
        Dim s As String = "1, 2, '1,233,333', '8,444,555'"
        ' Split on a comma followed by one whitespace character.
        Dim r As New Regex(",\s")
        Dim re() As String = r.Split(s)

This relies on there always being a space after each separating comma and never a space after the commas inside the numbers. If that's not always the case, you can:

  • Make the regex more complex (look here to see how messy things could get)
  • Use the library and be happier
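
As a rough illustration of the first option (a sketch, not the answerer's own code), the split pattern can use a lookahead so that only commas outside single quotes count as separators. The helper name SplitOutsideQuotes is hypothetical, and the sketch assumes quotes are always balanced and never escaped inside a field:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuoteAwareSplit
    ' Hypothetical helper (not from the answer): splits on commas that are
    ' outside single-quoted fields, then trims surrounding whitespace.
    ' Assumption: quotes are balanced and never escaped.
    Function SplitOutsideQuotes(line As String) As String()
        ' Split only at a comma followed by an even number of single quotes
        ' up to the end of the string, i.e. a comma that is not inside quotes.
        Dim pattern As String = ",(?=(?:[^']*'[^']*')*[^']*$)"
        Dim fields As String() = Regex.Split(line, pattern)
        For i As Integer = 0 To fields.Length - 1
            fields(i) = fields(i).Trim()
        Next
        Return fields
    End Function

    Sub Main()
        Dim parts = SplitOutsideQuotes("test, 7535, '1,830,000', '5,000,000'")
        For Each part In parts
            Console.WriteLine(part)
        Next
    End Sub
End Module
```

With the string from the question this yields the four fields test, 7535, '1,830,000' and '5,000,000', whether or not a space follows each separator. The pattern gets harder to read with every extra rule, which is the messiness the first bullet warns about.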
Vinko Vrsalovic
I disagree about the need to necessarily use a CSV library. If you know the CSV file is well-formatted, then a simple method using ReadLine and Split will do the job perfectly well.
Noldorin
However, in this situation it is indeed advisable, given that there are comma-delimited numbers in fields.
Noldorin
So, in other words, you are saying you completely agree with me :)
Vinko Vrsalovic
Great pair programming guys :)
RedFilter
You're assuming the OP is parsing a file's worth of CSV. Taking a dependency on a CSV parsing library would certainly be overkill to split a string (which is what was asked).
Mark Brackett
Solved! Thank you very much!
@Mark: I agree, and that is one of the things I meant by 'a one-off task' (third paragraph). I added the warning because it seems very likely to me that he's actually parsing a CSV file rather than just one string. Additionally, this library is very small and efficient, so it's not a big concern.
Vinko Vrsalovic
A: 
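' The string from the question; replace with your own input.
Dim input As String = "test, 7535, '1,830,000', '5,000,000'"
Dim words As New List(Of String)()
Dim inQuotes As Boolean = False
Dim thisWord As String = ""
For Each c As Char In input
    ' A single quote toggles whether we are inside a quoted field.
    If c = "'"c Then inQuotes = Not inQuotes
    If c = ","c AndAlso Not inQuotes Then
        ' An unquoted comma ends the current field.
        words.Add(thisWord.Trim())
        thisWord = ""
    Else
        thisWord &= c
    End If
Next
' Add the last field, which has no trailing comma.
words.Add(thisWord.Trim())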
Mark Brackett
A: 

Try this regular expression: "('([^']|'')*'|[^',\r\n]*)(,|\r\n?|\n)?"
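
One possible way to apply that pattern (a sketch under assumptions, not the answerer's code) is to walk the matches and read the first capture group, which holds one quoted or unquoted field. The helper name MatchFields is hypothetical. Because the question's string has a space after each comma, the pattern also produces whitespace-only matches and an empty match at the end of the string, so this sketch trims each field and skips empty ones:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module RegexFieldMatcher
    ' Hypothetical helper (not from the answer): collects the first capture
    ' group of each match, which holds one quoted or unquoted field.
    Function MatchFields(line As String) As List(Of String)
        Dim pattern As String = "('([^']|'')*'|[^',\r\n]*)(,|\r\n?|\n)?"
        Dim fields As New List(Of String)()
        For Each m As Match In Regex.Matches(line, pattern)
            Dim field As String = m.Groups(1).Value.Trim()
            ' Assumption: whitespace-only and empty matches are padding, not
            ' data, so they are skipped. This also drops genuinely empty fields.
            If field.Length > 0 Then fields.Add(field)
        Next
        Return fields
    End Function

    Sub Main()
        For Each field In MatchFields("test, 7535, '1,830,000', '5,000,000'")
            Console.WriteLine(field)
        Next
    End Sub
End Module
```

If empty fields are meaningful in your data, the skip-empty rule would need to be replaced with something stricter, for example only discarding a match when its second group (the separator) is also empty.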

iburlakov