I have a string and want to:
- remove all characters except English letters (a..z)
- replace every whitespace sequence with a single space
How would you do that with C# 3.0?
Using regular expressions of course!
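// Removes every non-word character: letters, digits and underscores are kept, whitespace is removed.
string myCleanString = Regex.Replace(stringToCleanUp, @"[\W]", "");
// Or, to keep only ASCII letters and digits (whitespace is removed as well), use this pattern instead:
// string myCleanString = Regex.Replace(stringToCleanUp, @"[^a-zA-Z0-9]", "");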
Regex (edited)?
string s = "lsg @~A\tSd 2£R3 ad"; // note tab
s = Regex.Replace(s, @"\s+", " ");
s = Regex.Replace(s, @"[^a-zA-Z ]", ""); // "lsg A Sd R ad"
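One caveat with this order: if a stripped character sits between two spaces (for example "a @ b"), removing it after the whitespace has been collapsed leaves a double space behind. A minimal sketch that strips first and collapses afterwards; the names `TextCleaner` and `Clean` are made up for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

public static class TextCleaner
{
    // Hypothetical helper, not part of any library.
    public static string Clean(string input)
    {
        // 1. Drop everything that is neither an ASCII letter nor whitespace.
        string lettersOnly = Regex.Replace(input, @"[^a-zA-Z\s]", "");
        // 2. Collapse each run of whitespace into a single space.
        return Regex.Replace(lettersOnly, @"\s+", " ");
    }
}
```

With this order, `TextCleaner.Clean("a @ b")` gives "a b", and the sample string above still comes out as "lsg A Sd R ad".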
I think you can do this with a regular expression, as Marc and boekwurm mentioned.
Try these links as well: http://www.c-sharpcorner.com/UploadFile/prasad_1/RegExpPSD12062005021717AM/RegExpPSD.aspx
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx
Note: [a-z] is a range of characters; it matches any character in the specified range. For example, “[a-z]” matches any lowercase alphabetic character in the range “a” through “z”.
Regular expressions also provide special characters to represent common character ranges. You could use “[0-9]” to match any numeric digit, or you can use “\d”. Similarly, “\D” matches any character that is not a digit. Use “\s” to match any white-space character, and use “\S” to match any non-white-space character.
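To see these classes in action, here is a small sketch that counts how many characters of a string each pattern matches. The helper `CountMatches` and the class name are assumptions for illustration only:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: number of matches of a pattern in the input.
    public static int CountMatches(string input, string pattern)
    {
        return Regex.Matches(input, pattern).Count;
    }
}
```

For the input "ab 12\tcd" (eight characters, one of them a tab), `CountMatches` returns 2 for `\d`, 6 for `\D`, 2 for `\s`, 6 for `\S` and 4 for `[a-z]`. Note that in .NET `\d` also matches non-ASCII decimal digits by default, so `[0-9]` is the stricter choice if only ASCII digits should count.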
Of course the Regex solution is the best one (I think). But someone HAS to do it in LINQ, so I had some fun. There you go:
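bool inWhiteSpace = false; // true while we are inside a run of whitespace
string test = "lsg @~A\tSd 2£R3 ad";
var chars = test
    // keep only ASCII letters and whitespace
    .Where(c => ('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z') || char.IsWhiteSpace(c))
    .Select(c => {
        // the first whitespace of a run becomes ' '; the rest are marked with char.MinValue
        c = char.IsWhiteSpace(c) ? (inWhiteSpace ? char.MinValue : ' ') : c;
        inWhiteSpace = c == ' ' || c == char.MinValue;
        return c;
    })
    .Where(c => c != char.MinValue); // drop the marked duplicates
string result = new string(chars.ToArray()); // "lsg A Sd R ad"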
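The lambda above mutates a captured variable, so the query only behaves as intended when it is enumerated exactly once. If that bothers you, the same single pass can be written as a plain loop. This is only a sketch; `LoopCleaner` and `Clean` are hypothetical names:

```csharp
using System;
using System.Text;

public static class LoopCleaner
{
    // Hypothetical helper: keeps ASCII letters, collapses whitespace runs.
    public static string Clean(string input)
    {
        StringBuilder sb = new StringBuilder(input.Length);
        bool lastWasSpace = false;
        foreach (char c in input)
        {
            if (char.IsWhiteSpace(c))
            {
                // Emit one space for the whole run of whitespace.
                if (!lastWasSpace) sb.Append(' ');
                lastWasSpace = true;
            }
            else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
            {
                sb.Append(c);
                lastWasSpace = false;
            }
            // Any other character is dropped and does not end the current run.
        }
        return sb.ToString();
    }
}
```

For the test string above, `LoopCleaner.Clean(test)` produces the same "lsg A Sd R ad".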