views:

756

answers:

5

I see this on our stage system, after it has been up for 2-3 days.

"The server failed to resume the transaction. Desc:39000000ef." (with desc:xxx increasing every time).

The stack trace shows

System.Data.SqlClient.SqlException: The server failed to resume the transaction. Desc:39000000ef.
at System.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection)
at System.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj)
at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
at System.Data.SqlClient.SqlDataReader.ConsumeMetaData()
at System.Data.SqlClient.SqlDataReader.get_MetaData()
at System.Data.SqlClient.SqlCommand.FinishExecuteReader(SqlDataReader ds, RunBehavior runBehavior, String resetOptionsString)
at System.Data.SqlClient.SqlCommand.RunExecuteReaderTds(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, Boolean async)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method, DbAsyncResult result)
at System.Data.SqlClient.SqlCommand.RunExecuteReader(CommandBehavior cmdBehavior, RunBehavior runBehavior, Boolean returnStream, String method)
at System.Data.SqlClient.SqlCommand.ExecuteReader(CommandBehavior behavior, String method)
at System.Data.SqlClient.SqlCommand.ExecuteDbDataReader(CommandBehavior behavior)
at System.Data.Common.DbCommand.ExecuteReader()
at System.Data.Linq.SqlClient.SqlProvider.Execute(Expression query, QueryInfo queryInfo, IObjectReaderFactory factory, Object[] parentArgs, Object[] userArgs, ICompiledSubQuery[] subQueries, Object lastResult)
at System.Data.Linq.SqlClient.SqlProvider.ExecuteAll(Expression query, QueryInfo[] queryInfos, IObjectReaderFactory factory, Object[] userArguments, ICompiledSubQuery[] subQueries)
at System.Data.Linq.SqlClient.SqlProvider.System.Data.Linq.Provider.IProvider.Execute(Expression query)
at System.Data.Linq.DataQuery`1.System.Collections.Generic.IEnumerable<T>.GetEnumerator()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)
at ...

After this has happened the entire server goes to heck. It seems like the metadata is no longer correctly represented in memory, because I see many InvalidCastExceptions afterwards. I'm guessing this is due to the system trying to deserialize a string into an int field because the metadata is offset incorrectly.

i.e.

System.InvalidCastException: Specified cast is not valid.
at System.Data.SqlClient.SqlBuffer.get_Int32()
at System.Data.SqlClient.SqlDataReader.GetInt32(Int32 i)
at Read_Order(ObjectMaterializer`1 )
at System.Data.Linq.SqlClient.ObjectReaderCompiler.ObjectReader`2.MoveNext()
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at System.Linq.Enumerable.ToList[TSource](IEnumerable`1 source)

What does this mean?

I'm fairly sure the database wasn't updated behind the systems back, and the database was online the entire time.

The problem persists from here on out, until the server is restarted, after which everything works nicely.

Do I need to have code that re-establishes database connection if it fails? Doesn't the framework handle this by itself?

I'm using Sql server 2008, IIS 6, and .Net 3.5SP1

UPDATE: The code structure does something like this:

var dc = new datacontext()
IList<someobject> objs = dc.GetAllItemsToProcess()
var dc2 = new datacontext();
 foreach( var o in objs ) {
    try {
         var o2 = dc2.someobjects.SingleOrDefault(x=>x.id = o.id);
          // do stuff to o2
         dc2.save();
   } catch() {
          // something failed so restart datacontext()
         dc2 = new datacontext();
    }
}
A: 

This is likely not a problem with your code. It is a bug in SQL Server. They had a similar problem in SQL Server 2005. It only happened under conditions that were just right, so very few people ever saw it and those that did were very confused.

Having said that, here are some things to check that have worked for others with the same problem:

  • Look for DataReaders that aren't closed. Make sure you're doing myReader.Close() after you read the rows you want. Many people just rock on without closing.
  • Use the native SqlTransaction class instead of OleDbTransactions wherever possible.
  • Look at your transactions. Make sure you're committing/rolling back cleanly before you close your Connection.
  • Use Connection.BeginTransation rather than Connection.BeginDbTransaction
Rap
Thank you for your reply.We have 0 Datareaders, sql transactions or other handcoded stuff. Everything is done via the datacontext autogenerated from the dbml file, and every access is via linq.So I have no beginTransactions, or connection.closes, because I never handle them directly. Its all in the (microsoft) framework
Soraz
What's with the downvotes? How about some comments? @Soraz didn't say anything about Linq-to-SQL until after my answer.
Rap
A: 

Do you use DataContext object for atomic units of work ? this will help alot with the connection issues if exist.

also sometimes you need to check for a physical damage in RAM for example.

Ahmed Khalaf
+1  A: 
  • Don't share the same DataContext among different operations. Use the same DataContext to group the set of operations that you will be submitting. If you have another set that you will be submitting separately use a separate DataContext.
  • I am not sure, but also check if you are enumerating over 2 IQueryables at the same time. If that's the case, try using a separate DataContext for the second enumeration to make sure it isn't caused by linq2sql trying to open to readers at the same time.
  • Make sure your model is up to date with your DB. The structure define in the Linq2sql designer isn't updated automatically, so when you add a field you need to make sure that's added to the designer (perhaps by re-adding the table if you haven't renamed any fields).
eglasius
The model is updated in the code at least, because the site runs perfectly for 2 days before something happens.I've appended the OT with example of what the code does at the failing point.
Soraz
A: 

Just an update:

  • I tracked the bug to our use of Linq2Sql-entities passed as datasources directly to Report Server.

I.e. we were doing something like this:

List<Order> orderList = orderRepository.getOrders();
ReportViewer rv = new ReportViewer();
rv.LocalReport.DataSources.Add("OrderList", orderList);
rv.Render();

For some reason this made the datacontext confused, even when running in the same app-domain.

Soraz
A: 

So I like Rap's answer that it's a rare bug.

We're seeing this error about once every month. Because we use pooled connections, once a user gets this error message, it seems impervious to sp_reset_connection, and the next user of the connection will also get this message. Theoretically sp_reset_connection is supposed to clean up any transactions, variables, temp tables, etc. left open before passing it along.

Having said that, we do have some DataReaders that aren't closed, but we do not do any transactions at the .NET level. We can definitely clean up some unclosed data readers, but it's tough to see that the app has very high uptime with high usage, but then this error happens until the object engine is restarted or the connection times out of the pool.

We aren't using Linq. There are some tx's at the stored procedure level, and possibly one isn't rolling back properly in a certain error condition. Still, I refer back to the sp_reset_connection comment. Thanks in advance for any advice. -Rob

Rob S