views:

184

answers:

2

We have a particularly bizarre problem; let me set the scene. Solution found see below

We have three SQL Server 2005 databases, for the sake of argument called: Alpha, Beta and Gamma.

There is a replication relationship defined between these databases as follows:

All three databases have a table named "AnExample" with the same schema. Replication is set up so that Alpha is the provider and the other two databases are subscribers.

  1. A c# .Net 3.5 application using a MSDTC transaction (handled by a TransactionScope) is both reading and writing to databases: Alpha and Beta.
  2. The table "AnExample" within this transaction is being updated only on Alpha.
  3. The MSDTC transaction successfully commits.
  4. "AnExample" table is provably updated in Alpha and the change is immediately replicated into Gamma
  5. No change occurs on Beta (profiler confirms that no activity happens on the database), nor is any error raised in SQL logs or event logs
  6. Re-running the same query that updates "AnExample" in Management Studio with the same credentials is successful (replication to Beta occurs)
  7. Running a MSDTC transactioned Write to another table on Beta, then the exact same write to Alpha's "AnExample" table with a test application using the main applications DAL, connection strings and configuration also fully succeeds (replication to Beta occurs)

This has led us to believe that there is some change occurring in the main application that does not occur in isolation.

Possible clues / red herrings

The only difference we can see between our successful tests and the actual query that the main application uses is that the isolation level has somehow changed. On the successful queries it is set transaction isolation level Read Committed only, whereas it is set to serializable in the failing scenarios (despite no explicit call to change the isolation level being in the codebase or stored procedures).

We do feel that this is somewhat of a red herring as running the query with this isolation level in Management Studio again succeeds without issue. But the fact that it is different points that this might be a symptom of another issue that we have not yet discovered.

For completeness here are the settings for the query that fails to replicate to Beta (but does to Gamma).

-- network protocol: TCP/IP
set quoted_identifier on
set arithabort off
set numeric_roundabort off
set ansi_warnings on
set ansi_padding on
set ansi_nulls on
set concat_null_yields_null on
set cursor_close_on_commit off
set implicit_transactions off
set language us_english
set dateformat mdy
set datefirst 7
set transaction isolation level *serializable*

It's a bit of a headscratcher.

A: 

I assume this is transaction replication? If so, the changes to push to the subscribers are harvested from the transaction log by the log reader, so the problem does indeed sound strange.

I would try running a trace on the publisher db so you can watch what the replication agents do.. not sure what might be wrong, but maybe something will jump out at you.

To your first question yes it was transaction replication.Although we solved this - it's worth mentioning that we had used profiler on all the DB's involved and saw no activity associated with the failed replication - it was as if it never even tried.
Lex
A: 

So we found what appears to be the issue. As described above; we have a distributed transaction that writes to Alpha then Beta. The write to Beta is then successfully replicated to Gamma it fails silently to Alpha. However there was one omission that one of the writes to Alpha is actually being replicated to Beta (the trouble of a large database schema). Moving this write to Beta meant that suddenly replication succeeded.

I've not seen it documented that updates via distributed transactions cannot replicate more that one way but to be fair it is a somewhat obscure problem that was simply solved by ensuring all writes on replicated tables happened on the same database.

Hope this helps somebody else.

Lex