ansaurus

Question

Speed of finite state machines - OO vs procedural

Answer 1

+3 A:

I would prefer well ~~structured~~ organized, maintainable, comprehendable (even after leaving for a month, thats quite a quality) code instead of a "bunch of labels and goto statements." Being said that, I would also run a profiler on the code to detect the bottlenecks when doing a speed analysis.

ArunSaha 2010-09-30 04:29:06

Depends. If by "well structured" you mean "without goto's", I'd use goto's if appropriate. If you need a "JUMP" it cannot be replaced by a "CALL".

ruslik 2010-09-30 04:41:57

@ruslik: Godo one :-), but no, I did not meaned that. I meant "well organized."

ArunSaha 2010-09-30 04:49:28

Answer 2

+3 A:

Rather than thinking about it in terms of OO being slower than functional or procedural programming think about it in terms of operations. Calling a function, branching, fetching, storing etc... all take some time and to get an idea of the performance of each solution you'd need to tally up how much of each of these you'll need to do.

The best way to do this is to use your OO test solution and see if it's fast enough. If not then profile it, work out which branches or stores are costing you the most and see if you can avoid or streamline them. Slowly re-architect your solution until it meets the performance goals you're after. It may be that in some cases you'll have to adopt a more functional or procedural style.

Lastly if you do code up a single function that consists of hundreds of stack variables and goto you've done it wrong. Code must always be readable and maintainable.

Extra thought: Can you just spend another $5,000 on better hardware to avoid $50,000 in development costs optimising?

Daniel 2010-09-30 04:34:23

Although I do understand what you mean, this program will not run on one machine, it is for people to download and run on their home computers. So for most people, no, they will not spend any money to buy better hardware to help my program run faster. Also this program is in a market niche where it must compete with other similar products, and faster is better. So the speed goal is "as fast as we can bloody make it" and by definition, the program will never be "fast enough". That said, profiling will probably be the final word on it. Thanks.

SuperSoaper 2010-09-30 04:43:00

Answer 3

+1 A:

Simply put, the processor is faster at doing table lookups upon table lookups than branches. What you describe as the OO style will be faster, as long as the table is tractable and small enough to be fit into cache. (I wouldn't certainly say that either paradigm is associated with either implementation, though.)

To be very moderately more technical: The processor has 32-64 K of L1 cache and a few to a couple dozen megabytes of L2-L3. The branch caches amount to a few kilobytes at most.

Potatoswatter 2010-09-30 04:52:48

This is actually the answer I am looking for. In other notes, a simple test program that runs the machine 1,000,000 times was profiled with gprof. gprof says all the functions together took 0.1 time. The function that took the most time out of all the child functions of the machine is the one that determines whether a state is acceptable as the next state by looping through an array of acceptable previous states and comparing the previous state with each one. This sounds like I could perhaps use bit packing to get rid of the loop.

SuperSoaper 2010-09-30 05:17:51

@Super: Well, if you like the answer, you can click the up arrow at top left, and if it's the best one and resolves the question, click the checkmark. Welcome to SO!

Potatoswatter 2010-09-30 07:57:00

@SuperSoaper: There are better ways to profile than gprof. http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343

Mike Dunlavey 2010-09-30 14:24:35

@Super: I didn't read that closely enough. If you get through a million runs of the program in 0.1 seconds, then you aren't outstripping any cache. Either you don't need to optimize at all, or you need a larger testcase to expose the actual behavior of the program.

Potatoswatter 2010-09-30 15:02:53

Answer 4

+1 A:

How often does the state machine change? If it stays constant for a while, what I do is translate it into a hard-coded routine in whatever language I prefer.

Where does the transition diagram of the state machine come from? Does it come from a regular expression? You know you can translate that directly into structured code without first building an arc-set. In fact, it might just be easier to write that code to begin with.

Then I either just make that part of the app, or I dynamically compile and link it into a dll and dynamically load it. The latter takes just a couple seconds.

Finite state machines are so simple that they should basically run in a small fraction of the time it takes to load the I/O buffer. They shouldn't be doing any unnecessary memory allocation, data structure building, any of that OO hoo-haw.

Remember, a finite state machine represented as a set of states and arc tuples is a theoretical model. It exists, like Turing machines, in order to prove theorems about it. Just like a Turing machine, it is not necessarily a good implementation technique in code.

Just to show you what I mean, consider a regular expression for decimal integers:

dd*

where 'd' means digit. As a finite-state machine (tuples) it would be:

A d -> B
B d -> B

as code it would be:

char c = getc();
if (DIGIT(c)){
  c = getc();
  while(DIGIT(c)){
    c = getc();
  }
}

See the similarity of this code with the regular expression? Don't think of c = getc() as "get the next character c". Think of it as "accept the current character c". I hope you can see the code can't really get much faster than this.

If you're really starting from an arc-set, you could generate this:

c = getc();

A:
if (DIGIT(c)){c = getc(); goto B;}
goto END;

B:
if (DIGIT(c)){c = getc(); goto B;}
goto END;

END:

which is spaghetti code, but not any more than the arc-set is, and it will be just as fast as the structured code. (In fact, that's more or less what the compiler converts your structured code into.)

Mike Dunlavey 2010-09-30 14:02:59

ansaurus

tags:

views:

answers:

Speed of finite state machines - OO vs procedural

related questions