I have transactional replication running between two servers over a dedicated VPN connection. The databases are fairly large, so I initially use the backup and restore method to get the initial snapshot over to the subscriber machine and then let it apply the incremental transactions from there.
Everything runs fine until the VPN line gets flaky (which it does occassionally) at which point the replication process is prone to locking up. When I look on the subscriber side, there are a few SQL processes which appear to be hung and have locks held on the subscriber database and tables. The crazy thing is that those processes are coming from the replication service. I can assure you (from trial and error) that no other processes are locking this database except for replication itself.
So why would the replication process trip over its own feet like that? Why would it get hung just because of a loss of network connectivity? Any suggestions for somehow making it more reliable?