ansaurus

Question

Regular expression to define some binary sequence

Answer 1

A:

I don't think you would. I can't believe in any language using a regular expression could ever be the best way to do this.

Dave Webb 2009-05-15 06:42:50

i know its not the best way. I know it can be done but I just cant figure out how. It involves drawing the automata and eliminating middle states.

Jaelebi 2009-05-15 06:44:53

@Dave Webb, you can definitely do this. Actually, this is a pretty common sort of exercise in a CS Theory course, which is why I'm hesitant to answer this question.

BobbyShaftoe 2009-05-15 06:46:57

do you know how this can be done? any hints?

Jaelebi 2009-05-15 06:49:17

@Dave Webb The answer is (1(01*0)*1)*0*

Jaelebi 2009-05-15 07:57:07

@unknowh yahoo, not *quite* right, that won't work for 17 * 3 = 51 (110011). You need to allow for repetitions at more levels.

paxdiablo 2009-05-15 08:37:06

Answer 2

+1 A:

http://www.cs.princeton.edu/introcs/73fsa/ 'nuff said (First Google hit to the proper question).

Vinko Vrsalovic 2009-05-15 06:58:08

Answer 3

+5 A:

Using the DFA in http://www.cs.princeton.edu/introcs/73fsa/ we can make a regular expression the following way, where A, B, C represent the states of the DFA.

A = 1B + 0A
B = 1A + 0C
C = 1C + 0B

C = 1*0B // Eliminate recursion

B = 1A + 0(1*0B)
B = 01*0B + 1A
B = (01*0)*1A // Eliminate recursion

A = 1(01*0)*1A + 0A
A = (1(01*0)*1 + 0)A
A = (1(01*0)*1 + 0)* // Eliminate recursion

Resulting in a PCRE regex like:

/^(1(01*0)*1|0)+$/

Perl test/example:

use strict;

for(qw(
11
110
1001
1100
1111
0
1
10
111
)){
    print "$_ (", eval "0b$_", ") ";
    print /^(1(01*0)*1|0)+$/? "matched": "didnt match";
    print "\n";
}

Outputs:

11 (3) matched
110 (6) matched
1001 (9) matched
1100 (12) matched
1111 (15) matched
0 (0) matched
1 (1) didnt match
10 (2) didnt match
111 (7) didnt match

Qtax 2009-05-15 08:02:07

+1. Now this is great. I had no idea you could create a regular expression that easy from a DFA.

Lieven 2009-05-15 08:13:28

Answer 4

+3 A:

This has been asked a couple of times so I'm going to try for the definitive answer.

When you divide a number by three, there are only three possible remainders (0, 1 and 2). What you're aiming at is to ensure the remainder is 0, hence a multiple of three.

This can be done by an automata with the three states:

ST0, multiple of 3 (0, 3, 6, 9, ....).
ST1, multiple of 3 plus 1 (1, 4, 7, 10, ...).
ST2, multiple of 3 plus 2 (2, 5, 8, 11, ...).

Now think of any non-negative number (that's our domain) and multiply it by two (tack a binary zero on to it). The transitions for that are:

ST0 -> ST0 (3n * 2 = 3 * 2n, still a multiple of three).
ST1 -> ST2 ((3n+1) * 2 = 3*2n + 2, a multiple plus 2).
ST2 -> ST1 ((3n+2) * 2 = 3*2n + 4 = 3*(2n+1) + 1, a multiple plus 1).

Now think of any non-negative number and multiply it by two then add one (tack a binary one on to it). The transitions for that are:

ST0 -> ST1 (3n * 2 + 1 = 3*2n + 1, a multiple plus 1).
ST1 -> ST0 ((3n+1) * 2 + 1 = 3*2n + 2 + 1 = 3*(2n+1), a multiple).
ST2 -> ST2 ((3n+2) * 2 + 1 = 3*2n + 4 + 1 = 3*(2n+1) + 2, a multiple plus 2).

This idea is that, at the end, you need to finish up in state ST0. However, given that there can be an arbitrary number of sub-expressions (and sub-sub-expressions), it does not lend itslef easily to reduction to a regular expression.

What you have to do is allow for any of the transition sequences that can get from ST0 to ST0 then just repeat them:

These boil down to the two RE sequences:

ST0 --> ST0                                      :  0+
    [0]
ST0 --> ST1 (--> ST2 (--> ST2)* --> ST1)* --> ST0:  1(01*0)*1
    [1]     ([0]     ([1]    )* [0]    )* [1]

or the regex:

(0+|1(01*0)*1)+

paxdiablo 2009-05-15 08:31:53

ansaurus

tags:

views:

answers:

Regular expression to define some binary sequence

related questions