I need to parse a string so that the result looks like this:

"abc,def,ghi,klm,nop"

But the string I receive could look more like this:

",,,abc,,def,ghi,,,,,,,,,klm,,,nop"

The point is, I don't know in advance how many commas separate the words.
Is there a regex I could use in C# to solve this problem?

+3  A: 

Search for ,,+ and replace all with ,.

So in C# that could look like

resultString = Regex.Replace(subjectString, ",,+", ",");

,,+ means "match all occurrences of two commas or more", so single commas won't be touched. This can also be written as ,{2,}.
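As a minimal sketch of what this replacement does to the sample string from the question: the class name `CommaCollapse` and the method name `Collapse` are made up for illustration; only `Regex.Replace` and the pattern come from the answer. Note that a run of commas at the very start or end of the input is collapsed to one comma rather than removed.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommaCollapse
{
    // Hypothetical helper: collapses every run of two or more commas into one.
    public static string Collapse(string subjectString)
    {
        return Regex.Replace(subjectString, ",,+", ",");
    }

    public static void Main()
    {
        string input = ",,,abc,,def,ghi,,,,,,,,,klm,,,nop";

        // The leading ",,," is a run of commas too, so it becomes a single ",".
        Console.WriteLine(Collapse(input)); // ,abc,def,ghi,klm,nop

        // Single commas are left untouched.
        Console.WriteLine(Collapse("abc,def")); // abc,def
    }
}
```

If the leading comma is unwanted, it still has to be stripped separately, for example with `Trim(',')`.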

Tim Pietzcker
+4  A: 

You can use the ,{2,} expression to match any occurrences of 2 or more commas, and then replace them with a single comma.

You'll probably need a Trim call in there too, to remove any leading or trailing commas left over from the Regex.Replace call. (It's possible that there's some way to do this with just a regex replace, but nothing springs immediately to mind.)

string goodString = Regex.Replace(badString, ",{2,}", ",").Trim(',');
LukeH
Thanks Luke, it does the trick pretty well!
Raphyboy
+2  A: 

A simple solution without regular expressions:

string[] items = inputString.Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);
string result = String.Join(",", items);
Thomas Levesque
Good to know about that solution! Thanks!
Raphyboy
+2  A: 

Actually, you can do it without any Trim calls.

text = Regex.Replace(text, "^,+|,+$|(?<=,),+", "");

should do the trick.

The idea behind the regex is to match only what we want to remove. The first part matches any run of consecutive commas at the start of the input string, the second matches any run of consecutive commas at the end, and the last matches any run of commas that follows a comma.
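To see the three alternatives at work, here is a small sketch that runs the pattern against the sample string from the question and two edge cases. The class name `CommaCleaner` and the method name `Clean` are hypothetical; the pattern is the one given above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CommaCleaner
{
    // Hypothetical helper wrapping the single Regex.Replace call.
    //   ^,+       a run of commas at the start of the input
    //   ,+$       a run of commas at the end of the input
    //   (?<=,),+  any commas that come directly after another comma
    public static string Clean(string text)
    {
        return Regex.Replace(text, "^,+|,+$|(?<=,),+", "");
    }

    public static void Main()
    {
        // The first comma of each inner run is kept, the rest are removed.
        Console.WriteLine(Clean(",,,abc,,def,ghi,,,,,,,,,klm,,,nop")); // abc,def,ghi,klm,nop

        // Trailing runs are removed entirely by the second alternative.
        Console.WriteLine(Clean("abc,,def,,,")); // abc,def

        // An input made only of commas becomes the empty string.
        Console.WriteLine(Clean(",,,").Length); // 0
    }
}
```

The lookbehind `(?<=,)` inspects the original input, not the partially replaced output, which is why the first comma of each inner run survives as the separator.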

dionadar