Hi,

I would like to understand what the following code is doing. This logic is part of a routine that strips HTML out of the body of an email message.

mBBSREgEx.IgnoreCase = True
mBBSREgEx.Global = True
mBBSREgEx.Pattern = "<[^>]*>"
sResult = mBBSREgEx.Replace(sResult, "")

Thank you, Jim

OK, if I wanted to change the routine to strip out tags constructed like [b]Hello[/b], instead of the existing one that strips out "<" and ">" tags, how would I change the pattern?

Would this work?

mBBSREgEx.Pattern = "\[[^\]]*\]"

Thanks, Jim

+4  A: 

It replaces all tags with empty strings.

It looks for an opening <, then takes as many characters as possible that are not >, then it finds a closing >. Finally, it replaces all of that (everything from < to >) with an empty string, effectively stripping out HTML tags.
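To see that matching behaviour in action, here is a minimal sketch in Python rather than VBScript, assuming the same pattern carries over unchanged. The name `strip_tags` is a hypothetical helper, not part of the original routine. Python's `re.sub` replaces every match by default, which plays the role of `Global = True`, and `re.IGNORECASE` mirrors `IgnoreCase = True` (it has no visible effect here, since the pattern contains no letters).

```python
import re

# Same pattern as in the question: a literal "<", then any run of
# characters that are not ">", then a literal ">".
TAG_PATTERN = re.compile(r"<[^>]*>", re.IGNORECASE)


def strip_tags(text: str) -> str:
    """Remove every <...> span from text (hypothetical helper name)."""
    # sub() replaces all matches, like Global = True in the VBScript code.
    return TAG_PATTERN.sub("", text)


print(strip_tags("<p>Hello <b>world</b></p>"))  # Hello world
```

Note that the pattern is purely textual: it removes anything between a `<` and the next `>`, so a sentence such as `1 < 2 and 3 > 2` would lose its middle part as well.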

Mark Rushakoff
+1 for explanation rather than just an answer.
richsage
Thanks a bunch!
Jim
A: 

The following pattern strips out tags constructed like [b]Hello[/b], replacing the existing one that strips out "<" and ">" tags:

mBBSREgEx.Pattern = "\[[^\]]*\]"
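As a quick check of the bracket pattern, here is a minimal sketch in Python rather than VBScript, assuming the pattern text itself is unchanged. The name `strip_bbcode` is a hypothetical helper introduced only for illustration.

```python
import re

# "\[" and "\]" match literal square brackets; "[^\]]*" matches any run
# of characters other than "]".
BBCODE_PATTERN = re.compile(r"\[[^\]]*\]")


def strip_bbcode(text: str) -> str:
    """Remove every [...] span from text (hypothetical helper name)."""
    return BBCODE_PATTERN.sub("", text)


print(strip_bbcode("[b]Hello[/b]"))  # Hello
```

The brackets need the backslash because `[` and `]` otherwise delimit a character class. As with the angle-bracket version, the match is purely textual, so any other bracketed text (for example the `[0]` in `array[0]`) would be removed too.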
jkasnicki
Thank you, perfect.
Jim