Any ideas?

My program is a file validation utility. I have to read in a format file and then split each line on a single space. But obviously, the person who wrote the format file may have used tabs, two spaces, or any other form of whitespace, so I'm looking for some code that turns every run of whitespace into a single space. I've tried this:

        public static string RemoveWhitespace(this string line) 
        {
            try 
            { 
                return new Regex(@"\s*").Replace(line, " "); 
            } 
            catch (Exception) 
            { 
                return line; 
            }
        }

I'm assuming this is wrong.
Help!

A: 

Could you use String.Replace?

nilphilus
@nilphilus: But that would involve a lot of statements to make sure all manner of spaces were replaced, i.e. someone could use 1 space, 2 spaces, 3 spaces, 7 spaces, 10 spaces, and so on...
New Start
+6  A: 

You can do this -

System.Text.RegularExpressions.Regex.Replace(str,@"\s+"," ");

where str is your string.
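For what it's worth, here is a minimal sketch of how this could be wrapped as an extension method like the one in the question. The class name `WhitespaceExtensions` and the method name `CollapseWhitespace` are made up for illustration. Note the `+` quantifier: `\s*` can also match the empty string, so with it a space gets inserted at positions where there was no whitespace at all.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative, not from the question.
public static class WhitespaceExtensions
{
    // \s+ matches one or more whitespace characters (spaces, tabs, newlines, ...).
    private static readonly Regex WhitespaceRun = new Regex(@"\s+");

    // Returns a NEW string in which every run of whitespace is a single space.
    public static string CollapseWhitespace(this string line)
    {
        return WhitespaceRun.Replace(line, " ");
    }
}
```

Strings are immutable, so the result has to be assigned or returned, e.g. `line = line.CollapseWhitespace();` before calling `line.Split(' ')`. Leading or trailing whitespace ends up as a single space, so a `Trim()` may be wanted as well.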

Sachin Shanbhag
@Sachin Shanbhag: I really want to accept this as my answer but it just doesn't seem to work. It just keeps throwing an exception. Also, just a general question; in regards to Regex, does '\s' just mean whitespace?
New Start
@New Start - Can you tell me what the error is? You are using the proper namespace, right?
Sachin Shanbhag
@New Start - '\s' matches any whitespace character. Check this - http://www.regular-expressions.info/charclass.html#shorthand
Sachin Shanbhag
@New Start - I have tried this on my end and it works fine. If you can tell me what your error is, I can help you with that.
Sachin Shanbhag
@Sachin Shanbhag: I was using the proper namespace, yes! My problem was that I was returning the original line instead of the edited line. Thank you for your help!
New Start
A: 

Why not find the index of the first white space, then .Replace(" ","") and then insert a white space at the initial index?

Brissles
There are a number of problems with that approach, but the most important is that you'd convert `a b c` to `a bc` rather than `a b c`. You're taking out all the spaces, but only inserting a space for the first run of spaces.
stevemegson
Yes, but he's doing it line by line and wants one space per line. Maybe I misunderstood.
Brissles
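To make stevemegson's objection concrete, here is a rough sketch of the "first space only" idea from this answer. The class and method names are hypothetical, and it assumes "white space" means the literal space character, as in the `.Replace(" ","")` call above.

```csharp
public static class FirstSpaceOnly
{
    // Hypothetical implementation of the suggestion above, for illustration only.
    public static string Apply(string line)
    {
        int first = line.IndexOf(' ');            // index of the first space
        if (first < 0) return line;               // no space at all, nothing to do
        string stripped = line.Replace(" ", "");  // removes EVERY space
        return stripped.Insert(first, " ");       // puts back only one
    }
}
```

With this sketch, `FirstSpaceOnly.Apply("a b c")` returns `"a bc"`: the separator between `b` and `c` is lost, so the line can no longer be split into three fields.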
A: 

This is a duplicate of this question

however the answer is this (credit to Daok)

Regex regex = new Regex(@"[ ]{2,}");     
tempo = regex.Replace(tempo, @" ");
Xander
This doesn’t take care of tabs.
Timwi
My thought exactly. I did actually read that question but it really didn't help my particular situation!
New Start
Point taken... that should teach me to read the question more closely. I read "multiple spaces" instead of "whitespace", which includes tabs etc.! Apologies.
Xander
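To illustrate the point about tabs raised in the comments above, here is a small side-by-side sketch of the two patterns from this thread. The class and method names are invented for the comparison.

```csharp
using System.Text.RegularExpressions;

public static class PatternComparison
{
    // Collapses only runs of two or more literal spaces; tabs are left untouched.
    public static string SpacesOnly(string s)
    {
        return Regex.Replace(s, @"[ ]{2,}", " ");
    }

    // Collapses every run of whitespace, including tabs, to a single space.
    public static string AnyWhitespace(string s)
    {
        return Regex.Replace(s, @"\s+", " ");
    }
}
```

For the input `"a\t\tb  c"`, `SpacesOnly` returns `"a\t\tb c"` (the tabs survive), while `AnyWhitespace` returns `"a b c"`.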