Of course, I can call this "mysterious" only because I have not been able to identify the problem. I am hoping it will be obvious to one of you erudite readers and that you can enlighten me :-)
Running a single fitnesse test in my browser (Firefox or IE) works fine, but when I run a suite of tests or a suite of suites, fitnesse stops very soon after beginning. It never reports test completion; it just hangs.
I am running fitnesse on Windows XP against a .NET 3.5 code base. I first attacked the problem by instrumenting both the fitnesse tests and the fitnesse fixtures with diagnostic code, to determine whether it was indeed fitnesse locking up or (more likely) my code base accessed by the fixtures. So I created some diagnostic routines that write to a log file to tell me when each fitnesse fixture is entered and left. If the last entry in the log file is an "enter", that points to getting stuck in the code base; if the last entry is a "leave", that points to fitnesse. The diagnostics are quite simple, requiring each fixture to be manually instrumented: observe the Diagnostic.Enter and Diagnostic.Leave methods in the skeleton code below. (The argument to the Leave method lets me see the text of an exception, if one occurs.)
public class AddFoobarEntityFixture : ColumnFixture
{
    public bool Ok()
    {
        Diagnostic.Enter();
        string exitMessage = null;
        try
        {
            . . .
        }
        catch (Exception exc)
        {
            exitMessage = exc.Message;
            return false;
        }
        finally
        {
            Diagnostic.Leave(exitMessage);
        }
        return true;
    }
}
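A helper along these lines could look like the sketch below. It is only an illustration of the Enter/Leave idea, not the code actually used here: the LogPath field, the log line format (timestamp, ENTER or LEAVE, caller name, optional message) and the use of StackFrame to find the calling fixture method are all assumptions. It sticks to features available in .NET 3.5.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical sketch of a Diagnostic helper; names and log format are assumptions.
public static class Diagnostic
{
    private static readonly object Sync = new object();

    // Assumed location; point this wherever the fixture process can write.
    public static string LogPath =
        Path.Combine(Path.GetTempPath(), "fixture-trace.log");

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Enter()
    {
        Write("ENTER", CallerName(), null);
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Leave(string exitMessage)
    {
        Write("LEAVE", CallerName(), exitMessage);
    }

    // Frame 0 is CallerName, frame 1 is Enter/Leave, frame 2 is the fixture method.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static string CallerName()
    {
        MethodBase method = new StackFrame(2, false).GetMethod();
        if (method == null)
        {
            return "unknown";
        }
        Type owner = method.DeclaringType;
        return (owner == null ? "?" : owner.Name) + "." + method.Name;
    }

    private static void Write(string kind, string caller, string detail)
    {
        string line = string.Format("{0:o} {1} {2}{3}{4}",
            DateTime.Now, kind, caller,
            detail == null ? "" : " | " + detail,
            Environment.NewLine);
        lock (Sync)
        {
            // Append and close on every call so the last line survives a hang.
            File.AppendAllText(LogPath, line);
        }
    }
}
```

Opening and closing the file on every call is slow, but it means the final log line is on disk even if the process never exits cleanly, which is the whole point when chasing a lockup. In an optimized build the JIT may inline a small fixture method into its caller, in which case the reported caller name can be one frame off.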
After running a series of trials for the same test suite, I made a couple of startling observations:
Run from a browser, the fitnesse output lags the test progression and (in this lockup scenario) never catches up. That is, in the browser I see anywhere from one to perhaps a dozen test tables executed. The log file, on the other hand, has shown up to about 35 test tables for the same executions. I suspect this lag is unrelated to the lockup because the web page stops updating long before the lockup occurs, while the log file continues to report test tables being executed.
The lockup occurs at random locations. My rough bar chart below shows eleven trials (one per row), with time (or number of test tables) on the horizontal axis. Each "X" represents one test table processed.
1> XXXXXXXXXXXXXXXXXXXXXXX
2> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
3> XXXXXXXXXXXXXXXXXXXXXX
4> XXXXXXXXXXXXXXXXXXXXXXX
5> XXXXXXXXXXXXXXXXXXXXXXXX
6> XXXXXXXXXXXXXXXXXXXXXXX
7> XXXXXXXXXXXXXXXXXXXXXXX
8> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
9> XXXXXXXXXXXXXXXXXXXXXXXX
A> XXXXXXXXXXXXXXXXXXXXXXXX
B> XXXXXXXXXXXXXXXXXXXXXXXX
Without exception, every Enter was balanced by a Leave in the log. This suggests that the problem is with fitnesse rather than the code under test. That conclusion does rely on two important assumptions, however: first, that every test fixture is instrumented, and second, that within each instrumented test fixture only trivial code lies outside the Enter-Leave bracketing (e.g. a return statement returning only a local value, or a variable declaration with a simple initializer or none). I have not fully vetted these two assumptions but I think they will prove to be OK.
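Checking the Enter/Leave balance by eye gets tedious across many trials, so the check can be automated. The sketch below assumes a hypothetical log format in which each line is space-separated as "timestamp ENTER-or-LEAVE caller" with an optional trailing message; the class and method names are made up for illustration. It returns every caller that was entered but never left.

```csharp
using System.Collections.Generic;

// Hypothetical checker for an Enter/Leave trace log; the line format is an assumption.
public static class TraceLogCheck
{
    // Returns callers whose ENTER has no matching LEAVE, outermost first.
    public static List<string> FindUnmatchedEnters(IEnumerable<string> lines)
    {
        Stack<string> open = new Stack<string>();
        foreach (string line in lines)
        {
            // Fields: timestamp, kind, caller, then the rest of the line.
            string[] parts = line.Split(new char[] { ' ' }, 4);
            if (parts.Length < 3)
            {
                continue; // skip blank or malformed lines
            }
            if (parts[1] == "ENTER")
            {
                open.Push(parts[2]);
            }
            else if (parts[1] == "LEAVE" && open.Count > 0 && open.Peek() == parts[2])
            {
                open.Pop();
            }
        }
        // A Stack enumerates top-first, so reverse to get outermost first.
        List<string> unmatched = new List<string>(open);
        unmatched.Reverse();
        return unmatched;
    }
}
```

Calling TraceLogCheck.FindUnmatchedEnters(File.ReadAllLines(path)) on the log from a hung run gives an empty list when every Enter was balanced, which points away from the fixture code; a non-empty list names the fixture method that never returned.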
I had hoped that fitnesse provided its own logging so I could see, for example, which SetUp or SuiteSetUp was inherited, when includes were processed, which test table was being run, etc. From what I have seen, however, the only logging fitnesse offers reports at the granularity of an entire test page, which is unfortunate.
Curiously, my own web searches have turned up absolutely no mention of others encountering this problem with fitnesse, which, of course, strongly suggests that the problem lies somewhere in my code base.
Any suggestions for isolating this problem, be it in fitnesse or in my code base, are appreciated!
2010.07.15 Update
Strangely enough, I think I fixed the problem. After changing the port that fitnesse uses from (what I think is) the default of 8080 to a less popular port number, I can now run suites of tests or suites of suites without problem. I did check (with TCPView) that I had nothing else running on port 8080. Does anyone have any thoughts on why this would make any difference?
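For anyone wanting to repeat the port check without TCPView, something similar can be done from .NET itself. This is only a rough sketch: the PortCheck class and its method names are invented for illustration, and it reports what the local TCP stack knows at the moment of the call, nothing more. It uses IPGlobalProperties from System.Net.NetworkInformation, which is available in .NET 3.5.

```csharp
using System.Net;
using System.Net.NetworkInformation;

// Hypothetical helper that mimics a quick TCPView-style look at one port.
public static class PortCheck
{
    // True if some local process already has a TCP listener bound to the port.
    public static bool IsTcpPortListening(int port)
    {
        IPEndPoint[] listeners =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpListeners();
        foreach (IPEndPoint endpoint in listeners)
        {
            if (endpoint.Port == port)
            {
                return true;
            }
        }
        return false;
    }

    // Counts TCP connections (in any state) whose local end uses the port.
    public static int CountLocalConnections(int port)
    {
        TcpConnectionInformation[] connections =
            IPGlobalProperties.GetIPGlobalProperties().GetActiveTcpConnections();
        int count = 0;
        foreach (TcpConnectionInformation connection in connections)
        {
            if (connection.LocalEndPoint.Port == port)
            {
                count++;
            }
        }
        return count;
    }
}
```

Running PortCheck.IsTcpPortListening(8080) before starting fitnesse shows whether anything is already listening there, and CountLocalConnections(8080) during a hung suite run shows how many connections are sitting on that port.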