I need a regular expression that matches braces correctly, i.e. one closing brace for every opening one. For example, from abc{abc{bc}xyz} I need it to match all of {abc{bc}xyz}, not just {abc{bc}. I tried this: ({.*?})

+1  A: 

This is not a task for a regular expression. What you're looking for at that point is a parser, which means a language grammar, LL(1), LALR, recursive descent, the Dragon Book, and generally a splitting migraine.

Bryan Ross
Yes, I know... not a horribly helpful answer, but I still have nightmares about the Dragon Book. ***shudder***
Bryan Ross
But that's the fun stuff!
guns
+2  A: 

This is not possible with regular expressions in the formal sense. Matching arbitrarily nested braces requires a context-free grammar, and regular expressions can only describe regular languages.

According to this link there is an extension available for the regular expressions in .NET that can do this, but this just means that .NET regular expressions are more than just regular expressions.

Trey
A: 

This is not possible in the "standard" regular expression language. However, a few different implementations have extensions that allow you to implement it. For example, here's a blog post that explains how to do it with .NET's regex library.

Generally speaking though, this is a task that regular expressions are not really suited to.

Dean Harding
+1  A: 

As Bryan said, regular expressions might not be the right tool here, but if you're using PHP, the manual gives an example of how you might be able to use regular expressions in a recursive/nested fashion:

$input = "plain [indent] deep [indent] deeper [/indent] deep [/indent] plain";

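function parseTagsRecursive($input)
{
    // (?R) recurses into the whole pattern, so nested [indent] blocks match as one unit.
    $regex = '#\[indent]((?:[^[]|\[(?!/?indent])|(?R))+)\[/indent]#';

    // When invoked as the preg_replace_callback callback, $input is the match array;
    // $input[1] holds the text between the outermost tags.
    if (is_array($input)) {
        $input = '<div style="margin-left: 10px">'.$input[1].'</div>';
    }

    // Run the replacement again on the result to handle inner [indent] blocks.
    return preg_replace_callback($regex, 'parseTagsRecursive', $input);
}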

$output = parseTagsRecursive($input);

echo $output;

I'm not sure if that'll be helpful to you or not.

nickf
A: 

Assuming what you want to do is select a maximal substring between { and }:

.*? is a lazy quantifier. That is, it will match as few characters as possible, so your expression stops at the first }. If you change your expression to {.*}, which is greedy and runs to the last } in the string, you should find it will work.
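
To make the lazy/greedy difference concrete, here is a minimal sketch in Python's `re` module (my choice of engine for illustration; the question does not say which one is in use). The second input string is a made-up example showing that the greedy form does not count braces, it only pairs the first { with the last }.

```python
import re

text = "abc{abc{bc}xyz}"

# Lazy: stops at the first closing brace it can reach.
lazy = re.search(r"\{.*?\}", text).group()

# Greedy: runs to the last closing brace in the string.
greedy = re.search(r"\{.*\}", text).group()

# Greedy does not count braces; with two separate groups it spans both.
two_groups = re.search(r"\{.*\}", "a{b} c{d}").group()

print(lazy)        # {abc{bc}
print(greedy)      # {abc{bc}xyz}
print(two_groups)  # {b} c{d}
```

So the greedy pattern is only safe when the input contains a single outermost pair of braces.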

If what you want to do is to verify that the braces are matched correctly, then, as the other answers have stated, this is not possible with a (single) regular expression. You can do it by scanning the string with a stack, though, or with some voodoo of iterating your regular expression over the previous maximal match. Yikes.
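
A rough sketch of the stack scan in Python, assuming braces are the only delimiters that matter and there is no escaping or quoting to worry about. The function name `outermost_groups` is just something I made up for the example.

```python
def outermost_groups(text):
    """Return the top-level {...} substrings of text.

    Raises ValueError if the braces are not balanced.
    """
    groups = []
    stack = []  # indices of "{" that have not been closed yet
    for i, ch in enumerate(text):
        if ch == "{":
            stack.append(i)
        elif ch == "}":
            if not stack:
                raise ValueError(f"unmatched '}}' at index {i}")
            start = stack.pop()
            if not stack:  # we just closed an outermost brace
                groups.append(text[start:i + 1])
    if stack:
        raise ValueError(f"unmatched '{{' at index {stack[-1]}")
    return groups


print(outermost_groups("abc{abc{bc}xyz}"))  # ['{abc{bc}xyz}']
print(outermost_groups("a{b} c{d{e}}"))     # ['{b}', '{d{e}}']
```

Since there is only one kind of bracket, a plain depth counter would work too; the stack just makes it easy to remember where each open brace started.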

Nick Lewis
Not all regex engines support lazy quantifiers, so if you do use them, make sure yours supports them. Here's a related SO question: http://stackoverflow.com/questions/546433/regular-expression-to-match-outer-brackets
Trey
@Nick Lewis: I think you mean that lazy quantifiers will match the *most* number of characters possible, right?
Trey
@Trey Greedy quantifiers match the most characters possible, lazy quantifiers match the fewest. It's intuitive in that the lazy quantifier will stop as soon as possible (lazy) and the greedy quantifier will consume as much as possible (greedy).
Nick Lewis
@Nick Lewis: You're correct. I think another way to match lazily in this special case would be: `{[^}]*}`
Trey
+1  A: 
polygenelubricants
+1 for *"...many "regular expression" implementations actually recognize more than regular languages..."*
Bart Kiers
@Bart: That's why I always say "regex" instead. http://feather.perl6.nl/syn/S05.html#VERSION
Alan Moore
One for the bookmarks, thanks @Alan!
Bart Kiers