If you look at the call stack of a program and treat each return pointer as a token, what kind of automata is needed to build a recognizer for the valid states of the program?
As a corollary, what kind of automata is needed to build a recognizer for a specific bug state?
(Note: I'm only looking at the info that could be had from this function.)
My thought is that if these form regular languages than some interesting tools could be built around that. E.g. given a set of crash/failure dumps, automatically group them and generate a recognizer to identify new instances of know bugs.
Note: I'm not suggesting this as a diagnostic tool but as a data management tool for turning a pile of crash reports into something more useful.
- "These 54 crashes seem related, as do those 42."
- "These new crashes seem unrelated to anything before date X."
- etc.
It would seem that I've not been clear about what I'm thinking of accomplishing, so here's an example:
Say you have a program that has three bugs in it.
- Two bugs that cause invalid args to be passed to a single function tripping the same sanity check.
- A function that if given a (valid) corner case goes into an infinite recursion.
Also as that when the program crashes (failed assert, uncaught exception, seg-V, stack overflow, etc.) it grabs a stack trace, extracts the call sites on it and ships them to a QA reporting server. (I'm assuming that only that information is extracted because 1, it's easy to get with a one time per project cost and 2, it has a simple, definite meaning that can be used without any special knowledge about the program)
What I'm proposing would be a tool that would attempt to classify incoming reports as connected to one of the known bugs (or as a new bug).
The simplest thing would be to assume that one failure site is one bug, but in the first example, two bugs get detected in the same place. The next easiest thing would be to require the entire stack to match, but again, this doesn't work in cases like the second example where you have multiple pieces of (valid) valid code that can trip the same bug.