Hey everyone,

First off, sorry for the newbie question. Believe me when I say I've been RTFM'ing. I'm not lazy, I'm just dumb (apparently). On the bright side, this could earn someone some easy points here.

I'm trying to do a match/replace with a pattern that contains special characters, and I'm running into syntax errors in a Flex 3 app. I just want the following regex, which should replace HTML tags with "", to compile:

value.replace(/</?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?>/g, "");

On a side note, the pattern /<.*?>/g wouldn't work in cases where there are HTML entities between tags, like so:

<TEXTFORMAT LEADING="2">
<P ALIGN="LEFT">
<FONT FACE="Arial" SIZE="11" COLOR="#4F4A4A" LETTERSPACING="0" KERNING="0">&lt;one</FONT>
</P>
</TEXTFORMAT><TEXTFORMAT LEADING="2">
<P ALIGN="LEFT">
<FONT FACE="Arial" SIZE="11" COLOR="#4F4A4A" LETTERSPACING="0" KERNING="0">two</FONT>
</P>
</TEXTFORMAT>

The first regex would get both "&lt;one" and "two", but the second would only get "two".

Thanks! Stabby L

+1  A: 

Here is what you are looking for:

// Strips HTML tags from a string.
// @param html - string to parse
// @param tags - comma-separated list of tag names to keep
public static function stripHtmlTags(html:String, tags:String = ""):String
{
    var tagsToBeKept:Array = new Array();
    if (tags.length > 0)
        tagsToBeKept = tags.split(new RegExp("\\s*,\\s*"));

    var tagsToKeep:Array = new Array();
    for (var i:int = 0; i < tagsToBeKept.length; i++)
    {
        if (tagsToBeKept[i] != null && tagsToBeKept[i] != "")
            tagsToKeep.push(tagsToBeKept[i]);
    }

    var toBeRemoved:Array = new Array();
    var tagRegExp:RegExp = new RegExp("<([^>\\s]+)(\\s[^>]+)*>", "g");

    var foundedStrings:Array = html.match(tagRegExp);
    for (i = 0; i < foundedStrings.length; i++) 
    {
        var tagFlag:Boolean = false;
        if (tagsToKeep != null) 
        {
            for (var j:int = 0; j < tagsToKeep.length; j++)
            {
                var tmpRegExp:RegExp = new RegExp("<\/?" + tagsToKeep[j] + " ?/?>", "i");
                var tmpStr:String = foundedStrings[i] as String;
                if (tmpStr.search(tmpRegExp) != -1) 
                    tagFlag = true;
            }
        }
        if (!tagFlag)
            toBeRemoved.push(foundedStrings[i]);
    }
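    // Escape every regex metacharacter in the tag text so it is matched literally
    // (the original escaped only + * $ /, so a tag containing e.g. "?" or "(" could break).
    var tmpRE:RegExp = /([.*+?^${}()|\[\]\\\/])/g;
    for (i = 0; i < toBeRemoved.length; i++) 
    {
        var tmpRemRE:RegExp = new RegExp((toBeRemoved[i] as String).replace(tmpRE, "\\$1"), "g");
        html = html.replace(tmpRemRE, "");
    }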
    return html;
}
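
A short usage sketch may help show how the two parameters are meant to be used. The class name HtmlUtil is hypothetical (the function is static, so it has to live in some class, but the answer does not name one), and the sample string is made up for illustration.

```actionscript
// Assumes stripHtmlTags() above is a static method of a hypothetical class HtmlUtil.
var html:String = '<P ALIGN="LEFT"><B>one</B> two</P>';

// No second argument: every matched tag is removed.
var plain:String = HtmlUtil.stripHtmlTags(html);

// Second argument: comma-separated tag names to keep (compared case-insensitively).
var keepBold:String = HtmlUtil.stripHtmlTags(html, "b");

trace(plain);
trace(keepBold);
```

One thing to watch: the keep check builds its pattern from just the tag name, so as written it appears to match only tags without attributes. Asking to keep "p" would likely preserve </P> but still strip <P ALIGN="LEFT">.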

See http://fightskillz.com/2010/01/flexactionscript-3-0-strip-html-tags-function/ for more information.
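
As for the compile error itself, the likely cause is that a regex literal is delimited by "/", so each "/" inside the pattern has to be written as "\/". Below is a minimal sketch of the question's pattern with only that change applied. The function name stripTags is made up for illustration. Note also that replace() returns a new string rather than modifying value, so the result has to be assigned or returned.

```actionscript
// Hypothetical helper; the pattern is the one from the question with each "/" escaped.
function stripTags(value:String):String
{
    var tagPattern:RegExp = /<\/?\w+((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)\/?>/g;

    // String.replace() does not change value in place, so return the result.
    return value.replace(tagPattern, "");
}
```

Because the pattern requires a tag name right after "<", entity text such as &lt;one between tags should be left alone.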

Todd Moses