views:

103

answers:

3

Pretty much what the question says. I came up with

(ba)?(a + bb + bbbbb + aba)*(ab)?

Is there anything more readable? Or is this incorrect? I know you shouldn't really be doing this sorta thing with Regex when you can just go !~/bbb/ in your code, but it's a theory exercise.

Thanks.

Edit for Clarification: I'm not using | to represent the OR bit in the Regex and using + it instead. Sorry for the confusion.

Edit 2: {a,b} is for a language with just 'a' and 'b' characters. Not {mininum, maximum}. Sorry again.

Edit 3: Because this is part of a theory class, we're just dealing with the basics of Regex. The only things you're allowed to use are +, ?, () and *. You cannot use {minimum, maximum).

+1  A: 

Hmm, something like this?

^(a|(?<!b)b{1,2}(?!b)|b{4,})*$

edit:

Edit 3: Because this is part of a theory class, we're just dealing with the basics of Regex. The only things you're allowed to use are +, ?, () and *. You cannot use {minimum, maximum).

Pfff, talking about tying your hands behind your back... Simple solution: you cannot do it (^ & $ are requirements for it ever to work), and we need the |. So, come up with a better conditions. Dropping the lookbehind & lookahead could be done, but isn't going to be pretty (at least, not without violating DRY):

^(b|bb|bbbb+)?(a+(b|bb|bbbb+)?)*$
Wrikken
No, ^ and $ are requirements when expressing the solution in certain regex forms. What they do here is mandate that the entire expression be part of the regular expression, but in theoretical questions the regex may be assumed to be for the entire string. I don't know about `|`; that's generally considered basic (unlike `+`).
David Thornley
I'm sorry, I realize it may be seem extremely inefficient doing it this way, but my lecturer gave us these restrictions and asked us to come up with a solution. Maybe he was trying to make this exact point.
Gaurav Dadhania
A: 

You're matching a string without precisely 3 b's in a row. That means you're looking at substrings like "aa", "aba", "abba", and "abbbbb*a", where any of the exterior a's could be the beginning or end of the string, can be overlapped, and can be multiple. This suggests something like:

(a + ab + abb + abbbbb*)*

with appropriate additions to account for the missing a at the beginning of the string. There's a lot of repetitions, but that's how regular expressions work in basic form.

David Thornley
'bb' or 'bbbbbbb' will not be valid for this Regex, if I'm not wrong.
Gaurav Dadhania
Right - it only accepts strings starting with a. As this is a homework question, I'm happy to help with problems but I'm not posting a complete solution.
David Thornley
+1  A: 

I think I have a working regex. Let —which is a notation I invented just now—be the regex that matches zero or more b's, except it won't match three of them. This can be replaced by (ε | b | bb | bbbb+), so don't worry that I'm using magic or anything. Now I think that matching strings can be seen as repeating subpatterns of zero or more a's followed by , which could be (a*b°)*, but you need there to be at least one "a" in between sequences of b's. So your final regex is a*b°(a+b°)*.

Since can match the empty string, the initial a* is superfluous as the a+ can pick up the initial a's just fine, so the regex can be optimized down to b°(a+b°)* (thanks, wrikken).

Cirno de Bergerac
I've out of upvotes, but yes, this works. A quick note to the OP: While Cirno is using ε (epsilon) to represent the empty string, your textbook may use λ (lambda) instead. The meaning is the unchanged, though.
bcat
if `b°` = _zero or more b's_, you can drop the starting `a*` in `a*b°(a+b°)*` (so, `b°(a+b°)*`)
Wrikken
Just a small question, in the last part '(a+b0)*', does the + mean 'one or more' or 'a choice (OR)'? If you mean the former, an ending with 'a's would not be valid, if I'm not wrong.
Gaurav Dadhania
@Gaurav: yes, it would be valid, as b° can be zero b's.
Wrikken