views:

165

answers:

3

Is it possible to convert a properly formed (in terms of brackets) expression such as

((a and b) or c) and d

into a Regex expression and use Java or another language's built-in engine with an input term such as ABCDE (case-insensitive...)?

So far I've tried something along the lines of (b)(^.?)(a|e)* for the search b and (a or e) but it isn't really working out. I'm looking for it to match the characters 'b' and any of 'a' or 'e' that appear in the input string.

About the process - I'm thinking of splitting the input string into an array (based on this Regex) and receiving as output the characters that match (or none if the AND/OR conditions are not met). I'm relatively new to Regex and haven't spent a lot of time on it, so I'm sorry if what I'm asking about is not possible or the answer is really obvious.

Thanks for any replies.

A: 

No. A regex isn't computationally powerful enough to make sure that the opening and closing parentheses match. You need something that can describe it using a formal grammar.

Dan Monego
+1  A: 

The language of strings with balanced parentheses is not a regular language, which means no (pure) regular expression will match it.

That is because some kind of memory construct, usually a stack, is needed to maintain open parentheses.

That said, many languages offer recursive evaluation in regexes, notably Perl. I don't know the fine details, but I'm not going to bother with them because you can probably write your own parser.

Just iterate over every character in the string and keep track of a counter of open parentheses and a stack of strings. When you get to an open parentheses, push the stack in and put characters that aren't parentheses into string of the stack. When you get to a closed parentheses, evaluate the expression that you had built up and store the result onto the back of the string that's on the top of the stack.

Then again, I'm not fully sure I understand what you're doing. I apologize, then, if this is no help.

Platinum Azure
I was trying to get a recursive solution using normal String compare working before posting here, but I thought there would perhaps be an easy way using Regex. Thanks for the suggestions.
raptors
+1  A: 

I'm not entirely certain I understand what you're trying to do, but here's something that might help. Start with something like

((a and b) or c) and d

And pass it through these substitution statements:

s/or/|/g
s/and| //g
s/([^()|])/(?=.*$1)/g

That will give you

(((?=.*a)(?=.*b))|(?=.*c))(?=.*d)

which is a regex that will match what you want.

Beta
I couldn't get this to work on ABCD with an online tester such as http://regexpal.com/ . Thanks, but I've decided to go with a more conventional recursive String search approach as recommended above.
raptors
Sadly, there are different flavors of regex engine. The (?=foo) stuff is "lookahead", which Perl can handle, but apparently regexpal can't. I don't know what you're using, but my approach works under Perl and, according to regextester, Javascript.
Beta