I have a list of legal characters and I want to remove all other characters from a text.

// my legal chars: a-Z, numbers, space, _, - and percent
string legalChars = @"[\p{L}\p{Nd}_\- %]*";

string text = "[update], Text with {illegal} chars such as: !? {}";

I can find a lot of examples for removing a set of illegal chars. I want to do the opposite: keep only the legal chars and remove everything else.

+1  A: 

How about:

String trimmed = Regex.Replace(input, @"[^\p{L}\p{Nd}_\- %]", "");

Or:

private static readonly Regex RemovalPattern 
   = new Regex(@"[^\p{L}\p{Nd}_\- %]");

...


string trimmed = RemovalPattern.Replace(input, "");

Note that your regex of legal characters currently doesn't include space, contrary to the comment.
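For what it's worth, here is a minimal, self-contained sketch of the negated-character-class approach applied to the sample text from the question. The class name `LegalCharsDemo` and the helper `KeepLegal` are made up for illustration and are not part of the original post.

```csharp
using System.Text.RegularExpressions;

public static class LegalCharsDemo
{
    // Negated class: matches any single character that is NOT a letter,
    // a decimal digit, an underscore, a hyphen, a space or a percent sign.
    private static readonly Regex RemovalPattern =
        new Regex(@"[^\p{L}\p{Nd}_\- %]");

    // Hypothetical helper: replaces every non-legal character with nothing.
    public static string KeepLegal(string input)
    {
        return RemovalPattern.Replace(input, "");
    }
}
```

Calling `LegalCharsDemo.KeepLegal(text)` with the `text` from the question strips the brackets, braces, comma, colon, `!` and `?`, but keeps the spaces that sat between them, so you may want a `Trim()` afterwards.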

Jon Skeet
I think he was hoping the XML character reference for a space would match a space, and the one for the percent sign would be the percent char he's referring to (%). Placing the dash (-) at the end of the list of characters would eliminate the need to escape it. Also, if you're going to use the Regex a lot, the RegexOptions.Compiled option may improve performance:

private static readonly Regex RemovalPattern = new Regex(@"[^\p{L}\p{Nd}_ %-]", RegexOptions.Compiled);
Arne Sostack
Thanks Arne. Jon has put me on the right track, but I still had to replace the XML-safe notation.
W0ut
A: 

Why not loop through the string yourself and check each character? If it's a legal char, append it to a new string (for example with a StringBuilder).
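A rough sketch of that loop, assuming the same legal set as in the question (letters, decimal digits, `_`, `-`, space and `%`). The names `ManualFilter` and `KeepLegal` are hypothetical. `char.IsLetter` and `char.IsDigit` are used here as stand-ins for `\p{L}` and `\p{Nd}`.

```csharp
using System.Text;

public static class ManualFilter
{
    // Hypothetical helper: copies only the legal characters into a new string.
    public static string KeepLegal(string input)
    {
        var result = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            // char.IsLetter covers the Unicode letter categories,
            // char.IsDigit covers decimal digits.
            bool legal = char.IsLetter(c) || char.IsDigit(c)
                         || c == '_' || c == '-' || c == ' ' || c == '%';
            if (legal)
            {
                result.Append(c);
            }
        }
        return result.ToString();
    }
}
```

This avoids Regex entirely, at the cost of spelling out the legal set by hand.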

PoweRoy
Hehe, Regex is way simpler, never mind this post. I never knew you could replace with a regex (I never use it).
PoweRoy