views:

165

answers:

5

What is a regexp that accepts everything over the language {0,1} but has no substring 110 or 101?

Accept:

  • 111111
  • 000011111
  • 100001000001001
  • 010
  • 1

Reject:

  • 100110
  • 010100
  • 123

Edit: Per comments on answers below, this question is asking for a formal regular expression.

+6  A: 

You'd be best off checking if it doesn't match /101|110/

Greg
I don't think it's that simple.not matching those 3 characters enforces a string of at least 3.My requirement is any string, including the empty string.
Absolute0
That won't enforce a string length - you just need to do something like (php) if (!preg_match('/101|110/', $str)) {}
Greg
Who said we're using php? :)I am talking about the formal regular expressions, not fancy magic ones.
Absolute0
Bart Kiers
See my comment to the question, though. This answer is wrong, sorry. It would even match strings.
Franz
This would work *if used in conjunction* with a positive check for `\A[01]*\z`. But it's simpler to do it in a single expression with a negative lookahead.
Peter Boughton
roe
A: 

This should work:

/^([01])\1*$/
Franz
It basically checks for there being a zero or one at the beginning of the string, and then multiple (or no) occurrences of that same number after the first digit.
Franz
That would reject `1001001` which does not contain either `101` or `110` so should be accepted.
Bart Kiers
Why are you escaping the "1" in "\1"
Absolute0
@Absolute0, Franz is back referencing what is matched in group 1.
Bart Kiers
I'm not escaping it. It's a backreference and thus checking for the same string that was found inside the first pair of parenthesis.And sorry. Looks like I didn't read everything...
Franz
why not just do /(^([01]))*/ ?
Absolute0
+4  A: 

This seems to work, assuming that your regex engine supports lookahead.

/^(1(?!01|10)|0)*$/
CAdaker
If Absolute0's question is an academic one, look-arounds are out of the question, I guess.
Bart Kiers
+11  A: 

This is the solution (even without lookahead):

/^0*(11*$|10$|100+)*$/
  • Start off with any number of zeros.
  • Loop (know: the string parsed so far does not end with "1" or "10")
    • "1$" is ok (&stop)
    • If you find "11", then you can't read any thing except ones until you reach the end
    • "10$" is ok.
    • If you read "10" and want to go on, read one or more zeros. Then go back to the loop.
Yaakov Belch
I did not test it, but it looks like you nailed it: well done! +1
Bart Kiers
$ is a look-ahead, and is not part of a formal regular expression.
Jeremy Stein
@Jeremy: $ is the "end of string" condition. you can call it \z if you want.
Yaakov Belch
I think a zero-width assertion is technically a look-ahead, but whatever you want to call it, it's not part of a formal mathematical regular expression.
Jeremy Stein
+3  A: 

Limited to formal regular expression notation:

((1|0*|0*1)(000*))*0*(10*|1*)
Jeremy Stein
Can this be simplified?It seems a bit to obfuscated.
Absolute0
I would think so, but my brain got awfully twisted just coming up with this. It's been too long since I studied DFAs and formal regular expressions.
Jeremy Stein