views:

211

answers:

6

Let L= { w in (0+1)* | w has even number of 1s}, i.e. L is the set of all bit strings with even number of 1s. Which one of the regular expressions below represents L?

A) (0*10*1)*
B) 0*(10*10*)*
C) 0*(10*1)* 0*
D) 0*1(10*1)* 10*

According to me option D is never correct because it does not represent the bit string with zero 1s. But what about the other options? We are concerned about the number of 1s(even or not) not the number of zeros doesn't matter.

Then which is the correct option and why?

+5  A: 

A if false. It doesn't get matched by 0110 (or any zeros-only non-empty string)

B represents OK. I won't bother proving it here since the page margins are too small.

C doesn't get matched by 010101010 (zero in the middle is not matched)

D as you said doesn't get matched by 00 or any other # with no ones.

So only B

DVK
+1 for the Fermat reference!
Jim Lewis
A also fails to match `0*`
BCS
The string "`000`" has an even number of 1s (zero 1s) but the A regex doesn't match it. (I guess I should have said that the A regex doesn't match `0+` as it does get the empty string). --- I pointed it out because It's an important corner case that hadn't been brought up and I did so *here* because I didn't think it was worth it's own answer.
BCS
Ah. OK, gotcha... Updated! Thanks!
DVK
A: 

Look for examples that should match but don't. 0, 11011, and 1100 should all match, but each one fails for one of those four

Michael Mrozek
A: 

C is incorrect because it does not allow any 0s between the second 1 of one group and the first 1 of the next group.

Ignacio Vazquez-Abrams
A: 

a quick python script actually eliminated all the possibilities:

import re

a = re.compile("(0*10*1)*")
b = re.compile("0*(10*10*)*")
c = re.compile("0*(10*1)* 0*")
d = re.compile("0*1(10*1)* 10*")

candidates = [('a',a),('b',b),('c',c),('d',d)]
tests = ['0110', '1100', '0011', '11011']
for test in tests:
    for candidate in candidates:
        if not candidate[1].match(test):
            candidates.remove(candidate)
            print "removed %s because it failed on %s" % (candidate[0], test)

ntests = ['1', '10', '01', '010', '10101']
for test in ntests:
    for candidate in candidates:
        if candidate[1].match(test):
            candidates.remove(candidate)
            print "removed %s because it matched on %s" % (candidate[0], test)

the output:

  • removed c because it failed on 0110
  • removed d because it failed on 0110
  • removed a because it matched on 1
  • removed b because it matched on 10
Igor
Just because you haven't disproven B, doesn't mean that you have proven B. Nice effort, though, just fallacious logic.
polygenelubricants
oops, my bad. when anchoring the expressions (putting each one between a ^ and a $), the only one that survives is B. of course, you'd still have to prove it...
Igor
I don't think the whitespaces within the regular expressions are supposed to count. You should rerun it with whitespace ignored.
Gabe
+2  A: 

To solve such a problem you should

  1. Supply counterexample patterns to all "incorrect" regexps. This will be either a string in L that is not matched, or a matched string out of L.
  2. To prove the remaining "correct" pattern, you should answer two questions:

    • Does every string that matches the pattern belong to L? This can be done by devising properties each of matched strings should satisfy--for example, number of occurrences of some character...
    • Is every string in L matched by the regexp? This is done by dividing L into easily analyzable subclasses, and showing that each of them matches pattern in its own way.

(No concrete answers due to [homework]).

Pavel Shved
A: 

Examining the pattern B:

^0*(10*10*)*$

^          # match beginning of string
0*         # match zero or more '0'
(          # start group 1
 10*       # match '1' followed by zero or more '0'
 10*       # match '1' followed by zero or more '0'
)*         # end group 1 - match zero or more times
$          # end of string

Its pretty obvious that this pattern will only match strings who have 0,2,4,... 1's.

gnarf