What could be the exact Regex.IsMatch expression for

span[class|align|style]

I tried this one, but I am not getting the exact expected result:

if (!Regex.IsMatch(n.Value, @"span\[.*?style.*?\]", RegexOptions.IgnoreCase))  
    n.Value = Regex.Replace(n.Value, @"(span\[.*?)(\])", "$1" + ValToAdd + "$2");

I am checking whether the span's list already contains a 'style' element. If it is present, 'style' should not be inserted again; if it is absent, it should be added.

Any pointers?

+2  A: 

You forgot to add | before the ValToAdd.

if (!Regex.IsMatch(n.Value, @"span\[.*?\bstyle\b.*?\]", RegexOptions.IgnoreCase))  
    n.Value = Regex.Replace(n.Value, @"(span\[.*?)\]", "$1|" + ValToAdd + "]");

Also, your first regex would match span[class|align|somestyle]. Use the word boundary \b to match whole words. Note that this would still match span[class|align|some-style], because \b also matches between a non-word character such as - and a word character. The following regex matches style only when it is a complete entry, that is, preceded by [ or | and followed by | or ].

@"span\[.*(?<=\||\[)style(?=\||\]).*\]"
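
As a rough illustration of how the two patterns differ, here is a minimal sketch. The class and method names (StyleCheck, HasStyleWord, HasStyleEntry, EnsureStyle) are made up for this example, the value to add is assumed to be plain "style" without a leading |, and the lookahead is assumed to be meant to test for | or ].

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, for illustration only.
public static class StyleCheck
{
    // Word-boundary variant: also matches "some-style", because \b
    // matches between '-' and 's'.
    private static readonly Regex WordBoundary = new Regex(
        @"span\[.*?\bstyle\b.*?\]", RegexOptions.IgnoreCase);

    // Lookaround variant: "style" must be preceded by '[' or '|'
    // and followed by '|' or ']'.
    private static readonly Regex WholeEntry = new Regex(
        @"span\[.*(?<=\||\[)style(?=\||\]).*\]", RegexOptions.IgnoreCase);

    public static bool HasStyleWord(string value)
    {
        return WordBoundary.IsMatch(value);
    }

    public static bool HasStyleEntry(string value)
    {
        return WholeEntry.IsMatch(value);
    }

    // Appends "|style" before the closing bracket unless a complete
    // "style" entry is already present.
    public static string EnsureStyle(string value)
    {
        if (HasStyleEntry(value))
            return value;
        return Regex.Replace(value, @"(span\[.*?)\]", "$1|style]");
    }
}
```

With this sketch, HasStyleWord("span[class|align|some-style]") is true while HasStyleEntry is false for the same input, and EnsureStyle("span[class|align]") gives span[class|align|style].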
Amarghosh
I have defined ValToAdd like this: string ValToAdd = "|style";
SAK
@sam you said `i am not getting exact expected result` - care to post the output that you got?
Amarghosh
By "expected result" I meant the condition (Regex.IsMatch) that I am checking.
SAK
-span[class|align|style]. Is this hyphen before span related to the regex in any way?
SAK
If you're talking about matching things with stray hyphens around them, you can enclose your regex between `^` and `$`. They match the start and end of the input respectively (or of each line when RegexOptions.Multiline is set).
Amarghosh
+1  A: 

As much as I like regular expressions, if you're doing this often in your program you'll do better with a small class to represent your tokens. Consider this as a rough sketch:

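using System;
using System.Collections.Generic;

public class SamToken
{
    // The part before the brackets, e.g. "span".
    public string Head { get; set; }

    // Property names; the comparer treats "Style" and "style" as the same entry.
    private readonly HashSet<string> properties;
    public HashSet<string> Properties
    {
        get { return properties; }
    }

    public SamToken() : this("") { }

    public SamToken(string head)
    {
        Head = head;
        properties = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
    }

    // Adds the given properties; ones already present are ignored by the set.
    public void Add(params string[] newProperties)
    {
        if ((newProperties == null) || (newProperties.Length == 0))
            return;
        properties.UnionWith(newProperties);
    }

    // Renders the token back as head[prop1|prop2|...].
    public override string ToString()
    {
        return String.Format("{0}[{1}]", Head,
            String.Join("|", Properties));
    }
}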

Next, you can use a function to parse a token from a string, something along the lines of:

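// Goes inside SamToken; requires: using System.Text.RegularExpressions;
public static SamToken Parse(string str)
{
    if (String.IsNullOrEmpty(str))
        return null;
    // Group 1 is the head, group 2 is the '|'-separated property list.
    Match match = Regex.Match(str, @"^(\w*)\[([\w|]*)\]$");
    if (!match.Success)
        return null;
    SamToken token = new SamToken(match.Groups[1].Value);
    // Skip empty entries so that "span[]" does not yield an empty property.
    token.Add(match.Groups[2].Value.Split(new[] { '|' }, StringSplitOptions.RemoveEmptyEntries));
    return token;
}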

With something like this, it would be easy to add properties:

SamToken token = SamToken.Parse("span[hello|world]");
token.Add("style", "class");
string s = token.ToString(); 

As you can see, I've only put in a few minutes, but your code can be much more robust and more importantly, reusable. You don't have to rewrite that regex every time you want to check or add a property.
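
To tie this back to the original question (add 'style' only when it is missing), here is a compact, self-contained sketch of the same parse-then-add idea. The TokenDemo class and EnsureProperty method are names invented for this example, and it uses a List next to the set purely to keep the output order predictable.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class TokenDemo
{
    // Returns the token with `property` appended if it is not already
    // listed (case-insensitively). Input that does not look like
    // head[a|b|c] is returned unchanged.
    public static string EnsureProperty(string token, string property)
    {
        Match match = Regex.Match(token, @"^(\w*)\[([\w|]*)\]$");
        if (!match.Success)
            return token;

        // Keep the original order in a list; use a set only for lookup.
        List<string> names = new List<string>(
            match.Groups[2].Value.Split(
                new[] { '|' }, StringSplitOptions.RemoveEmptyEntries));
        HashSet<string> seen = new HashSet<string>(
            names, StringComparer.OrdinalIgnoreCase);

        if (!seen.Contains(property))
            names.Add(property);

        return match.Groups[1].Value + "[" +
            String.Join("|", names.ToArray()) + "]";
    }
}
```

Under these assumptions, EnsureProperty("span[class|align]", "style") gives span[class|align|style], while span[class|Style] comes back unchanged because the lookup ignores case.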

Kobi