ansaurus

Question

Answer 1

A:

How about

<[^a](.|\n)+?>

?

Jimmy 2008-09-04 16:09:10

Answer 2

A:

@Jimmy: Close. This prevents the opening a tag from being stripped, but not the closing one.
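
To see the problem concretely, here is a minimal sketch in Python (my choice for illustration, not a language used in this thread) that applies Jimmy's pattern with re.sub. The sample string is made up.

```python
import re

# Jimmy's pattern, copied verbatim from the answer above.
JIMMY_PATTERN = re.compile(r"<[^a](.|\n)+?>")

# Made-up sample input, for illustration only.
sample = '<div><a href="x">link</a></div>'

result = JIMMY_PATTERN.sub("", sample)
print(result)  # <a href="x">link
```

The opening tag survives because [^a] rejects the character right after <. The closing tag starts with / rather than a, so it is matched and removed like any other tag.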

Jeff Winkworth 2008-09-04 16:12:50

You probably should have posted this as a comment on his answer instead.

Andreas Bonini 2009-12-28 21:17:13

SO didn't have comments in September 2008.

Jeff Winkworth 2010-01-05 20:45:21

Answer 3

+5 A:

<(?!\/?a(?=>|\s.*>))\/?.*?>

Try this. I had something similar for p tags and it worked for them, so I don't see why it wouldn't work here. It uses a negative lookahead to reject any tag that starts with a (optionally prefixed with a / character), where a nested positive lookahead requires that the a be followed either by > or by whitespace, other characters, and then >. Anything that is not rejected is matched up to the next > character. Put this in a substitution:

s/<(?!\/?a(?=>|\s.*>))\/?.*?>//g;

This should leave only the opening and closing a tags.
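
For anyone not using Perl, the same substitution can be sketched in Python. This is only an illustration: the function name strip_except_links_regex and the sample string are made up, and the pattern is copied unchanged from above (Python does not need the / escaped, but \/ is harmless).

```python
import re

# Pattern copied from the substitution above.
KEEP_LINKS = re.compile(r"<(?!\/?a(?=>|\s.*>))\/?.*?>")


def strip_except_links_regex(html_text):
    """Hypothetical helper: remove every tag the pattern matches."""
    return KEEP_LINKS.sub("", html_text)


# Made-up sample input, for illustration only.
sample = '<div><p>See <a href="http://example.com">this</a> <b>now</b></p></div>'
print(strip_except_links_regex(sample))
# See <a href="http://example.com">this</a> now
```

Tags that merely begin with a, such as abbr, are still stripped, because the inner lookahead insists on > or whitespace right after the a. Note that . does not match a newline by default, so a tag that spans several lines would be left in place by this sketch.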

Xetius 2008-09-04 16:29:23

Answer 4

A:

@Xetius: Worked brilliantly! Thanks!

Update: one minor tweak is required for an AS3 implementation: <(?!\/?a(?=>|\\s.*>))\/?.*?>

That's a \\s instead of a \s.

Jeff Winkworth 2008-09-04 16:51:49

Answer 5

A:

I keep going on about it, but I can't recommend regexr often enough. It's fantastic for testing this kind of thing.

grapefrukt 2008-09-05 12:41:37

Answer 6

A:

@grapefrukt: I'm actually a regular user of RegExBuddy, but I couldn't figure out how to exclude the a tag. ;-)

Jeff Winkworth 2008-09-05 12:57:20

Answer 7

A:

In general there are problems with this approach. Regexes are best for 'flat' text matches - nested data pushes regex engines into areas for which they are not designed. General HTML parsing needs a parser not a regex engine (Google for the difference between regular and context-free languages if you want the full technical details).

It is easy to strip out all tags by replacing /</ and />/ with the empty string or their entity equivalents but selectively filtering HTML using regexes will be vulnerable to a wide range of accidental or malicious inputs breaking things.
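
A parser-based version of the same filter can be sketched with Python's standard html.parser module. Treat this as an illustration under assumptions rather than a finished sanitizer: Python is not the language used in this thread, the names LinkOnlyStripper and strip_except_links_parsed are made up, and the sketch does not validate href values.

```python
from html import escape
from html.parser import HTMLParser


class LinkOnlyStripper(HTMLParser):
    """Hypothetical helper: re-emits text and a tags, drops every other tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Rebuild the tag from parsed attributes instead of echoing raw input.
            rendered = "".join(
                ' {}="{}"'.format(name, escape(value or "", quote=True))
                for name, value in attrs
            )
            self.parts.append("<a" + rendered + ">")

    def handle_endtag(self, tag):
        if tag == "a":
            self.parts.append("</a>")

    def handle_data(self, data):
        # Re-escape text so a stray < or & cannot form a new tag.
        self.parts.append(escape(data, quote=False))


def strip_except_links_parsed(html_text):
    parser = LinkOnlyStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)


print(strip_except_links_parsed('<p>Hi <b>there</b>, <a href="x">link</a></p>'))
# Hi there, <a href="x">link</a>
```

Because the parser tokenizes the input first, a > inside a quoted attribute value does not end the tag early, which is one of the inputs that trips up the regex approach.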

domgblackwell 2008-09-22 11:36:53

Answer 8

A:

Here you go. This version keeps i, b, and h1 through h6 tags; replace the alternation with a to keep links instead:

{<(?!/?(?:i|b|h[1-6])[\s>/])[^>]*>}

Qamar 2009-12-28 08:06:21


Strip all HTML tags except links
