I have a replication setup that was up and running yesterday. Now I can't connect to the distributor and I don't know why.

  • The publisher is on a domain and talks to the subscriber (not on the domain) via VPN.
  • I can ping the publisher from the subscriber and vice-versa.
  • I can log in to MSSQL on the publisher from the subscriber.
  • The publisher is in an error state because no activity is being logged from the subscriber.
  • The subscriber can't connect to the distributor.
  • Tracer tokens report a time for the distributor but only pending for the subscriber.
  • Assuming db changes by a third person might be causing troubles, I deleted the subscription and the publication and recreated them.
  • I can confirm that yesterday using the same tables/user accounts/servers etc the replication was working and replicating data.
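
For reference, the tracer token check can also be run from T-SQL instead of Replication Monitor. This is only a rough sketch: `MyPublication` is a placeholder name, the 30-second wait is arbitrary, and it assumes a SQL Server version that supports tracer tokens (2005 or later).

```sql
-- Run at the publisher, in the publication database.
-- N'MyPublication' is a placeholder; substitute the real publication name.
DECLARE @token_id int;

-- Post a tracer token into the publication's transaction log.
EXEC sys.sp_posttracertoken
    @publication = N'MyPublication',
    @tracer_token_id = @token_id OUTPUT;

-- Give the Log Reader and Distribution agents some time to move the token.
WAITFOR DELAY '00:00:30';

-- Show latency per hop for this token. A distributor latency with no
-- subscriber latency matches the "pending" state described above.
EXEC sys.sp_helptracertokenhistory
    @publication = N'MyPublication',
    @tracer_id = @token_id;
```

If the distributor latency is filled in but the subscriber latency never appears, the token is stuck between the distributor and the subscriber.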

If you have any suggestions to try to narrow down where the problem is, they would be most appreciated.

Answers to questions:

  • I am using one-way transactional replication.
  • The distributor is on the same machine as the publisher.
  • Both the publisher and the subscriber are 2003 SP2

Seems to be working again

I still don't know why the original working setup stopped working. But when I re-subscribed I had the wrong credentials for one of the connections. By correcting that problem the replication is working once again.

+2  A: 

Can you execute T-SQL in both directions? You say that the subscriber can't connect to the distributor, but can the distributor see both? Is the distributor on your network or the remote network? I'm just wondering if things are batching up on the distributor and the disconnect is between there and your subscriber, or if the disconnect is publisher -> distributor, especially if all three are on different servers.

Just somewhere to start.

I suspect the disconnect is between the publisher and distributor. In the setup I had working previously, when I inserted a tracer token it would make it to the distributor but no further.
+1  A: 

I have never had to debug problems with the log reader, but I often had to debug issues with the subscriber. What I would do is run the Distribution Agent (I remember it being called distrib.exe, because I typed it in so often!) with a heck of a lot of debugging flags set. (I just looked up the docs in BOL and see an argument OutputVerboseLevel. I don't remember that argument name directly, but it looks plausible.)

You can figure out what kind of arguments to provide to the agent by looking up the job definition that was set up by the replication wizard. One of the steps should be a 'run an external command' step, and that will have a bunch of the argument names and values.
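
If you would rather not click through the job in the GUI, something like the following should list the agent's arguments. Treat it as a sketch: it assumes the Distribution Agent job steps are registered under the `Distribution` subsystem in SQL Server Agent, and that you run it on the server that owns the job (the distributor for a push subscription, the subscriber for a pull subscription).

```sql
-- List SQL Server Agent job steps that run the Distribution Agent,
-- including the command text that holds the agent's arguments.
SELECT j.name      AS job_name,
       s.step_id,
       s.step_name,
       s.subsystem,
       s.command   -- argument names and values passed to the agent
FROM msdb.dbo.sysjobs AS j
JOIN msdb.dbo.sysjobsteps AS s
    ON s.job_id = j.job_id
WHERE s.subsystem = N'Distribution'
ORDER BY j.name, s.step_id;
```

The `command` column is the part to copy into your own invocation, with the verbose output argument added on the end.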

Once I had figured out the appropriate invocation, I saved it in a batch file and would then run it from the DOS prompt whenever there was a problem that was not debuggable from the GUI.


Paul Harrington