views:

1268

answers:

7

I'm implementing a DFA as close as I can to the formal definition as a learning exercise (and blogging material)

I planned on using a java.util.Set where a set is involved in the definition.

The definition involves a set of tuples to define the legal state transitions: (state,symbol) -> nextState.

I have a Transition class with members state, symbol and nextState. I have implemented equals() and hashCode() to indicate that two Transitions are equal if they match on state and symbol. I then have a java.util.Set of Transition instances.

In my processing algorithm, I have the current state when I read the next symbol. I anticipated building a Transition object using these two to pull out the matching Transition from the Set, which would then tell me the next state, and I can iterate.

But - I don't see any way of extracting a member of a java.util.Set for further use. I can remove(Object o), but that just return boolean.

What am I doing wrong?

+2  A: 

Set is probably not what you want to use for this. My recommendation would be to use a List< Transition>, or possibly a Map< State,List< Transition>>. I'm not sure which would be better without actually building it and doing some benchmarking.

Matthew Brubaker
It's not about performance or anything, it's just a simple easy to understand implementation. I like the Map thought - the definition says that there is a transition function, not a set of transition functions - so I think it would be in the spirit...
Brabster
I'm thinking Map<<Transition>, State> where state is the next state, actually - let me give that a go.
Brabster
Transition/State would also a good choice for the Map entries. =)
Matthew Brubaker
+1  A: 

It sounds as though your overriding of equals() and hashCode() is dubious, because the original transition matches the one in the set according to equals() and yet the two are not interchangeable (otherwise you would just use the new transition in place of the original.)

You probably want a class that is just a combination of state and symbol with no other attributes, and use that as a key in a Map. Alternatively you could use a Map<State, Map<Symbol, State>>

finnw
+1  A: 

I agree with Matthew Brubaker that Set is probably not what you need. You might want to try enums instead; see the Java Glossary for an example.

Michael Myers
+1  A: 

Can't you accomplish this without having some external collection of states, or even the Transistion objects? If the State class is defined like:

public class State
{
    private Map<Symbol, State> transitions = new HashMap<Symbol, State>();

    public State() { /* populate transitions somehow */ }

    public State nextState(Symbol symbol) { return transitions.get(symbol); }
}

Then, if you have a reference to the initial state, you can just move from state to state like this:

State initial = getInitialStateSomehow();
State second = initial.nextState(SYMBOL_1);
State third = initial.nextState(SYMBOL_2);  // etc...
Outlaw Programmer
A: 
Chad Okere
You are thinking deterministic finite state machine! The OP probably meant a non-deterministic finite state machine where using sets for the current states and the accepted states makes a lot more sense.
Konrad Rudolph
He specifically said "DFA", not "NFA". It would make more sense if he was talking about an NFA, though.
Chad Okere
A: 

If you just want to implement a pattern matching engine, a State Design Pattern might be unnecessary since the patter is unlikely to change. As Chad has pointed out, using a switch to encode the transition function is completely acceptable in such cases.

Here's an example of a nondeterministic pattern matching automaton that uses sets:

public boolean validate() {
    Set<Integer> currentStates = new HashSet<Integer>();
    final Set<Integer> acceptingStates = new HashSet<Integer>();

    currentStates.add(0); // Initial state.
    acceptingStates.add(1);
    acceptingStates.add(3);
    acceptingStates.add(6);

    for (int i = 0; i < getInput().length(); i++) {
        char c = getInput().charAt(i);
        Set<Integer> newStates = new HashSet<Integer>();

        for (int state : currentStates) {
            switch (state) {
                case 0:
                    if (c == 'a')
                        newStates.add(1);
                    break;
                case 1:
                    if (c == 'b') {
                        newStates.add(2);
                        newStates.add(4);
                    }
                    break;
                case 2:
                    if (c == 'b')
                        newStates.add(3);
                    break;
                case 3:
                    if (c == 'b')
                        newStates.add(2);
                    break;
                case 4:
                    if (c == 'b')
                        newStates.add(5);
                    break;
                case 5:
                    if (c == 'b')
                        newStates.add(6);
                    break;
                case 6:
                    if (c == 'b')
                        newStates.add(4);
                    break;
            }
        }

        if (newStates.size() == 0)
            return false;
        currentStates = newStates;

        System.out.printf("Character read: %c\n", c);
        System.out.printf("Current states: ");
        printStates(currentStates);
    }

    for (int state : acceptingStates)
        if (currentStates.contains(state))
            return true;
    return false;
}

This automaton recognizes input words of the regular language described by the pattern "a(bb*|bbb*)", i.e. an “a” followed by either a multiple of two or a multiple of three many “b”s.

Konrad Rudolph
A: 

Nice code, however you have to provide all the code.