Hey guys, how can I expand this code:

string BBCSplit = Regex.Replace(BBC, @"<(.|\n)*?>", string.Empty);

How can I expand this to also remove special characters (e.g. : ; , etc.) while it still does what it does now, which is remove div tags?

I don't fully understand regex syntax yet, sorry!

Thanks,

Ash

A: 

You can add more with the alternation character (pipe or |).

Arnshea
A: 
string BBCSplit = Regex.Replace(BBC, @"<(.|\n)*?>|[:;]", string.Empty);

Be careful if the "special" characters you want to remove include '"', ']', etc. Inside the brackets, ']' and '\' need a '\' before them, and in an @ string a '"' is written as '""'.

MarkusQ
This looks right. I would add "\," to it based on the examples. Also, I was wondering: should the characters inside the []s be separated by a comma (or perhaps even a pipe)? So something like [:,\,,;]
gnomed
@gnomed -- No, the characters in the square brackets should not be separated by anything.
MarkusQ
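To illustrate the point about separators: inside square brackets every character is taken literally, so a comma or pipe added as a "separator" simply becomes one more character to remove. A minimal sketch, where CharClassDemo and Strip are names made up for this example:

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharClassDemo
{
    // Hypothetical helper: removes every character matched by the pattern.
    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, string.Empty);
    }

    public static void Main()
    {
        string input = "a:b;c,d|e";

        // Only ':' and ';' are listed, so ',' and '|' survive.
        Console.WriteLine(Strip(input, @"[:;]"));   // abc,d|e

        // '|' inside brackets is a literal pipe, not alternation,
        // so it gets removed along with ':' and ';'.
        Console.WriteLine(Strip(input, @"[:|;]"));  // abc,de
    }
}
```

So [:;,] removes colons, semicolons, and commas, with nothing needed between them.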
A: 

There are plenty of ways to do it with regex. Markus's answer, extended to remove quotes, brackets, and punctuation, would be the following (remember to double your double quotes within an @ string; the hyphen is escaped so that ")-_" is not read as a range):

@"<(.|\n)*?>|[:;,!@#$%^&*()\-_+='""[\]]"
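One thing to watch in a class this long: a '-' between two characters inside brackets is read as a range, so ")-_" would cover everything from ')' to '_', digits and capital letters included. A minimal sketch of the difference, using a shortened class (HyphenRangeDemo and Strip are names made up for this example):

```csharp
using System;
using System.Text.RegularExpressions;

public static class HyphenRangeDemo
{
    // Unescaped: ")-_" is the range U+0029..U+005F, which covers
    // the digits and A-Z as well as the hyphen itself.
    public const string Unescaped = @"[()-_+]";

    // Escaped: '-' is just a literal hyphen.
    public const string Escaped = @"[()\-_+]";

    public static string Strip(string input, string pattern)
    {
        return Regex.Replace(input, pattern, string.Empty);
    }

    public static void Main()
    {
        string input = "Call 555-1234 (HOME)";

        // Digits and capital letters are removed along with the punctuation.
        Console.WriteLine("[" + Strip(input, Unescaped) + "]");

        // Only '(', ')' and '-' are removed from this input.
        Console.WriteLine("[" + Strip(input, Escaped) + "]");
    }
}
```

Putting the hyphen first or last in the class, where it cannot form a range, should work as well.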

Another method would be to remove every character that is neither whitespace nor a word character (\w covers letters, digits, and underscore).

@"<(.|\n)*?>|[^\s\w]"
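Here is a rough sketch of what that pattern does to a sample string. NegatedClassDemo and Clean are names made up for this example, and the input is invented:

```csharp
using System;
using System.Text.RegularExpressions;

public static class NegatedClassDemo
{
    // Tags, or any single character that is neither whitespace
    // nor a word character.
    public const string Pattern = @"<(.|\n)*?>|[^\s\w]";

    public static string Clean(string input)
    {
        return Regex.Replace(input, Pattern, string.Empty);
    }

    public static void Main()
    {
        // The tags and the ',', ':', '\'' and '!' are removed;
        // the underscore stays because \w matches it.
        Console.WriteLine(Clean("<div>Hello, world: it's_me!</div>"));
        // Hello world its_me
    }
}
```

If underscores should go too, they would have to be removed separately, since [^\s\w] deliberately leaves them alone.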

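I would suggest being stricter with your regex. If you want the tag part to look for an actual tag (an optional slash followed by a tag name), go with the following. Note that [^ \w] keeps only literal spaces, so tabs and newlines are removed as well: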

@"</?\w*(.|\s)*?>|[^ \w]"
James