I'm currently working on some .NET-based software (.NET Framework 3.5 SP1) that integrates with HP Quality Center 10.0 through its COM client API (often referred to as TDApiOle80 or TDApiOle80.TDConnection).
We are using xUnit.net 1.6.1.1521 and Gallio 3.1.397.0 (invoked from an MSBuild file).
For each integration test we go through the following process:
- Creating a connection
- Running a test
- Closing the connection
- Disposing
- Forcing a GC.Collect() / GC.WaitForPendingFinalizers()
Each integration test is run with a timeout configured in its Fact attribute.
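The per-test steps above can be sketched roughly as follows. This is only an illustration, not our actual code: IQcConnection, FakeConnection and ConnectionFixture are hypothetical names standing in for the real TDConnection wrapper, so that the sketch runs without Quality Center installed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the COM connection object; the real tests
// use TDConnection from the Quality Center client API instead.
public interface IQcConnection
{
    void Open();
    void Close();
}

// Fake connection that records calls, so the sketch runs without Quality Center.
public sealed class FakeConnection : IQcConnection
{
    public readonly List<string> Calls = new List<string>();
    public void Open() { Calls.Add("open"); }
    public void Close() { Calls.Add("close"); }
}

// One instance per test: connect in the constructor, clean up in Dispose.
public sealed class ConnectionFixture : IDisposable
{
    private IQcConnection _connection;

    public ConnectionFixture(IQcConnection connection)
    {
        _connection = connection;
        _connection.Open();
    }

    public IQcConnection Connection { get { return _connection; } }

    public void Dispose()
    {
        if (_connection == null) return; // already disposed
        _connection.Close();
        _connection = null;

        // Encourage release of COM wrappers that are no longer referenced.
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```

Wrapping the fixture in a using block inside the test body, rather than in the test class constructor and Dispose, keeps the whole lifecycle inside the code that the Fact timeout covers.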
The problem is that after a few tests (about 10 or so), calls into Quality Center block indefinitely, and the whole of Gallio freezes and no longer responds.
Originally we discovered that xUnit.net only applies its timeout to the code within the fact, so it would wait indefinitely for the constructor or Dispose method to complete. We moved that logic into the body of the tests to confirm, but this has not solved the problem (the run still hangs after a certain number of tests).
The same thing happens when using TestDriven.Net: we can run one or a few tests interactively, but with more than about 10 tests the whole run freezes, and our only choice is to kill the ProcessInvocation86.exe process used by TD.Net.
Does anyone have any tips/tricks on how to stop this happening altogether, or at least on how to insulate my integration tests from these kinds of problems, so that when the QC API blocks indefinitely the test fails with a timeout and Gallio moves on to the next test?
Update
The hint towards using an STA thread has helped move the issue forward a bit: via a custom xUnit.net attribute we now launch each test in its own STA thread. This has stopped Gallio/TestDriven.Net from locking up entirely, so we can include the integration tests in the run on our Hudson build server.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using Xunit;
using Xunit.Sdk;

public class StaThreadFactAttribute : FactAttribute
{
    const int DefaultTime = 30000; // 30 seconds

    public StaThreadFactAttribute()
    {
        Timeout = DefaultTime;
    }

    protected override IEnumerable<ITestCommand> EnumerateTestCommands(IMethodInfo method)
    {
        // Temporarily clear Timeout so the base class does not add its own
        // timeout handling; the STA command below applies the timeout instead.
        int timeout = Timeout;
        Timeout = 0;
        var commands = base.EnumerateTestCommands(method).ToList();
        Timeout = timeout;
        return commands.Select(command => new StaThreadTimeoutCommand(command, Timeout, method)).Cast<ITestCommand>();
    }
}

public class StaThreadTimeoutCommand : DelegatingTestCommand
{
    readonly int _timeout;
    readonly IMethodInfo _testMethod;

    public StaThreadTimeoutCommand(ITestCommand innerCommand, int timeout, IMethodInfo testMethod)
        : base(innerCommand)
    {
        _timeout = timeout;
        _testMethod = testMethod;
    }

    public override MethodResult Execute(object testClass)
    {
        MethodResult result = null;
        ThreadStart work = delegate
        {
            try
            {
                result = InnerCommand.Execute(testClass);
                // Dispose on the test thread so COM objects are released
                // from the same apartment that created them.
                var disposable = testClass as IDisposable;
                if (disposable != null) disposable.Dispose();
            }
            catch (Exception ex)
            {
                result = new FailedResult(_testMethod, ex, DisplayName);
            }
        };
        var thread = new Thread(work);
        thread.SetApartmentState(ApartmentState.STA); // Set the thread to STA
        thread.Start();
        // Wait for the test thread, but give up once the timeout elapses.
        if (!thread.Join(_timeout))
        {
            return new FailedResult(_testMethod, new Xunit.Sdk.TimeoutException((long)_timeout), DisplayName);
        }
        return result;
    }
}
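Stripped of the xUnit types, the core of the command above (run the work on a dedicated thread, then Join with a timeout) can be sketched on its own. StaRunner and TryRun are hypothetical names used only for this illustration. The sketch uses TrySetApartmentState, which as far as I know returns false rather than throwing where STA is not available, and it marks the thread as a background thread, which the attribute above does not do.

```csharp
using System;
using System.Threading;

public static class StaRunner
{
    // Runs 'work' on a dedicated thread and waits at most 'timeoutMs'.
    // Returns true if the work finished in time, false on timeout.
    // An exception thrown by the work is rethrown, wrapped, on the caller's thread.
    public static bool TryRun(ThreadStart work, int timeoutMs)
    {
        Exception error = null;
        var thread = new Thread(() =>
        {
            try { work(); }
            catch (Exception ex) { error = ex; }
        });

        // Request an STA apartment; a false return value (for example on a
        // platform without COM apartments) is ignored in this sketch.
        thread.TrySetApartmentState(ApartmentState.STA);
        thread.IsBackground = true; // a hung worker should not keep the process alive
        thread.Start();

        if (!thread.Join(timeoutMs))
            return false; // timed out; the worker is abandoned, not aborted

        if (error != null)
            throw new InvalidOperationException("Work failed on the worker thread.", error);
        return true;
    }
}
```

Note that a timed-out worker is simply left running here. If it is stuck inside an unmanaged call it still exists in the process, so this only insulates the caller from the hang and does not clean it up.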
We now see output like the following when running the tests with TestDriven.Net. Running the same suite several times results either in all tests passing or, more usually, in just one or two tests failing. After the first failure (a timeout), the second failure is reported as this "Error while unloading appdomain" issue.
Test 'IntegrationTests.Execute_Test1' failed: Test execution time exceeded: 30000ms
Test 'T:IntegrationTests.Execute_Test2' failed: Error while unloading appdomain. (Exception from HRESULT: 0x80131015)
    System.CannotUnloadAppDomainException: Error while unloading appdomain. (Exception from HRESULT: 0x80131015)
    at System.AppDomain.Unload(AppDomain domain)
    at Xunit.ExecutorWrapper.Dispose()
    at Xunit.Runner.TdNet.TdNetRunner.TestDriven.Framework.ITestRunner.RunMember(ITestListener listener, Assembly assembly, MemberInfo member)
    at TestDriven.TestRunner.AdaptorTestRunner.Run(ITestListener testListener, ITraceListener traceListener, String assemblyPath, String testPath)
    at TestDriven.TestRunner.ThreadTestRunner.Runner.Run()
4 passed, 2 failed, 0 skipped, took 50.42 seconds (xunit).
I have yet to establish why the Quality Center API hangs indefinitely at random. I will investigate this further shortly.
Update 27/07/2010
I've finally established the cause of the hanging - here's the problematic code:
connection = new TDConnection();
connection.InitConnectionEx(credentials.Host);
connection.Login(credentials.User, credentials.Password);
connection.Connect(credentials.Domain, credentials.Project);
connection.ConnectProjectEx(credentials.Domain, credentials.Project, credentials.User, credentials.Password);
It appears that calling Connect followed by ConnectProjectEx has a chance of blocking, although not deterministically. Removing the redundant connection calls seems to have increased the stability of the tests dramatically. The corrected connection code:
connection = new TDConnection();
connection.InitConnectionEx(credentials.Host);
connection.ConnectProjectEx(credentials.Domain, credentials.Project, credentials.User, credentials.Password);
Having inherited the codebase, I didn't give the connection code much thought.
One thing I have yet to figure out is why, even with the timeout code included above, Thread.Join(timeout) never returns. If you attach a debugger, it just shows that the test thread is in a joining/wait operation. Perhaps it has something to do with executing in an STA thread?