I am after a regular expression that collapses any run of two or more whitespace characters, leaving just one space behind.

For example, this line

The cow  jumped    over the moon

which has multiple spaces between some of the words, would become

The cow jumped over the moon
+7  A: 

Try this regular expression:

[ ]+

and replace it with a single space.
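
A minimal sketch of that replacement, assuming a sed that accepts -E for extended regular expressions (GNU and BSD sed both do). The function name collapse_spaces is hypothetical and only used for illustration. Note that the class [ ] contains only the space character, so tabs are left untouched.

```shell
# Hypothetical helper: collapse runs of the space character only.
collapse_spaces() {
  # -E enables extended regex syntax, so + means "one or more" without a backslash.
  sed -E 's/[ ]+/ /g'
}

echo "The cow  jumped    over the moon" | collapse_spaces
```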

Gumbo
This won't match tabs, as far as I know.
annakata
\s would be a better representation.
Cerebrus
Ah, you’re right. Apparently I skipped the “white” in “white space” and just read “space”.
Gumbo
A: 

The way I usually do it is to repeatedly replace two spaces with one until no more occurrences are found. This of course means multiple passes, with each round allocating and garbage-collecting a string, but I found the overhead to be noticeably less than parsing and executing a regex. Even with 64 spaces it takes only 7 passes to fix. Besides, typical strings have only 2-5 spaces in a row, so it works even faster.
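
The loop described above could be sketched as follows. This is an illustration in POSIX shell, not the code I actually use: the function name squeeze_by_passes is made up, and since plain sh has no built-in string replace, sed performs the literal two-spaces-to-one substitution on each pass.

```shell
# Hypothetical helper: replace "two spaces" with "one space" until none remain.
squeeze_by_passes() {
  sbp_text=$1
  while :; do
    case $sbp_text in
      # A double space is still present: run one more replacement pass.
      *"  "*) sbp_text=$(printf '%s\n' "$sbp_text" | sed 's/  / /g') ;;
      # No double space left: done.
      *) break ;;
    esac
  done
  printf '%s\n' "$sbp_text"
}

squeeze_by_passes "The cow  jumped    over the moon"
```

Each pass halves a run of spaces (64, 32, 16, 8, 4, 2, 1), so a run of 64 needs six replacing passes plus one final check that finds nothing left to replace.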

zvolkov
I can't believe this is faster under any circumstances where you could keep a static regex, and it's certainly less flexible.
annakata
+13  A: 
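// Needs System.Text.RegularExpressions; \s+ matches any run of whitespace (spaces, tabs, newlines).
string cleanedString = Regex.Replace(input, @"\s+", " ");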
Jeff Moser
A: 

I do this all the time with sed.

$ echo "The cow  jumped    over the moon" | sed -e 's/[     ]\+/ /g'
The cow jumped over the moon

In the character-class square brackets you have a space and a tab character. I quoted the '+' with '\', which may not be necessary if your regex engine takes '+' to mean "one-or-more" rather than a literal '+'.
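
A possible alternative that avoids typing a literal tab, assuming your sed supports -E (GNU and BSD sed both do): the POSIX class [[:blank:]] stands for space and tab, and with -E the '+' needs no backslash. The function name squeeze_blanks is hypothetical.

```shell
# Hypothetical helper: collapse runs of spaces and tabs into one space.
squeeze_blanks() {
  # [[:blank:]] matches a space or a tab; + is "one or more" under -E.
  sed -E 's/[[:blank:]]+/ /g'
}

printf 'The cow \t jumped\t\tover the moon\n' | squeeze_blanks
```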

Jason Catena