views:

180

answers:

4

I want to replicate data from a boat offshore to an onshore site. The connection is sometimes via a satellite link and can be slow and have high latency.

Latency matters in our application: the people onshore should have the data as soon as possible.

There is one table being replicated, consisting of an id, a datetime, and some binary data that varies in length, usually under 50 bytes.
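For reference, a minimal sketch of the table described above (table and column names are assumptions, not from the actual system):

```sql
-- Hypothetical schema matching the description: an id, a datetime,
-- and a short variable-length binary payload (usually < 50 bytes).
CREATE TABLE dbo.Measurements
(
    Id      INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    TakenAt DATETIME          NOT NULL,
    Payload VARBINARY(50)     NOT NULL
);
```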

An application offshore pushes data (hardware measurements) into the table constantly, and we want this data onshore as fast as possible.

Are there any tricks in MS SQL Server 2008 that can help decrease the bandwidth usage and decrease the latency? Initial testing uses a bandwidth of 100 kB/s.

Our alternative is to roll our own data transfer, and initial prototyping there uses a bandwidth of 10 kB/s (while transferring the same data in the same timespan). This is without any reliability and integrity checks, so this number is artificially low.

A: 

I'd suggest on-the-fly compression/decompression outside of SQL Server. That is, SQL Server replicates the data normally, but something in the network stack compresses the traffic so it's much smaller and more bandwidth efficient.

I don't know of a specific product, but I'm sure such tools exist.

Don't mess around with the SQL files directly. That's madness if not impossible.

gbn
I'm pulling my answer :) Not sure the 'not used MSSQL much' is enough warning to stop offshore 'Why is it corrupted' telephone calls later :P
Aiden Bell
A: 

Do you expect it to always be only one table that is replicated? Are there many updates, or just inserts? The replication is implemented by calling an insert/update sproc on the destination for each changed row. One cheap optimization is to force the sproc name to be small. By default it is composed from the table name, but IIRC you can force a different sproc name for the article. Given an insert of around 58 bytes for a row, saving 5 or 10 characters in the sproc name is significant.
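As a sketch of that trick, the per-article command names can be overridden when the article is defined; the publication, article, and procedure names below are placeholders:

```sql
-- Sketch: give the article very short custom command names so each
-- replicated command carries fewer bytes over the wire.
-- Publication, article, and procedure names are placeholders.
EXEC sp_addarticle
    @publication   = N'OffshorePub',
    @article       = N'Measurements',
    @source_object = N'Measurements',
    @ins_cmd       = N'CALL sp_i',   -- instead of e.g. CALL sp_MSins_dboMeasurements
    @upd_cmd       = N'SCALL sp_u',
    @del_cmd       = N'CALL sp_d';
```

The custom procedures on the subscriber then just need to be created (or renamed) to match these short names.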

I would guess that if you update the binary field it is typically a whole replacement? If that is incorrect and you might change only a small portion, you could roll your own diff-patching mechanism, perhaps a second table that contains a time series of byte changes to the originals. Sounds like a pain, but could yield huge bandwidth savings depending on your workload.
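One possible shape for such a patch table and its application on the subscriber (all names and sizes are assumptions, and this is only a sketch of the idea):

```sql
-- Sketch: instead of replicating the full binary value on every change,
-- replicate only the changed byte range and apply it on the subscriber.
CREATE TABLE dbo.MeasurementPatches
(
    PatchId    INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    RowId      INT               NOT NULL,  -- id of the original row
    ChangedAt  DATETIME          NOT NULL,
    ByteOffset INT               NOT NULL,  -- zero-based start of the change
    NewBytes   VARBINARY(50)     NOT NULL   -- the replacement bytes
);

-- Applying one patch on the subscriber: keep the prefix, splice in the
-- new bytes, keep the tail. SUBSTRING and DATALENGTH work on varbinary.
DECLARE @RowId INT = 42, @Offset INT = 10, @NewBytes VARBINARY(50) = 0xFFFF;

UPDATE dbo.Measurements
SET    Payload = SUBSTRING(Payload, 1, @Offset)
                 + @NewBytes
                 + SUBSTRING(Payload,
                             @Offset + DATALENGTH(@NewBytes) + 1,
                             DATALENGTH(Payload))
WHERE  Id = @RowId;
```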

Are the inserts generally done in logical batches? If so, you could store a batch of inserts as one customized blob in a replicated table, and have a secondary process that unpacks them into the final table you want to work with. This would reduce the overhead of these small rows flowing through replication.
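A minimal sketch of such a staging table, assuming a simple self-defined packing format (names are placeholders):

```sql
-- Sketch: replicate one packed blob per batch instead of many tiny rows.
-- The payload layout (e.g. repeated [datetime][length][bytes] records)
-- is your own convention; a subscriber-side job unpacks it into the
-- final table.
CREATE TABLE dbo.MeasurementBatches
(
    BatchId  INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
    PackedAt DATETIME          NOT NULL,
    Rows     VARBINARY(MAX)    NOT NULL
);
```

A SQL Agent job (or any secondary process) on the subscriber would then read new batches and insert the individual rows into the working table.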

It's just the one table and just inserts. The inserts are continuous, and I'd rather not store them and then batch-ship them, since that would probably increase the latency, the opposite of what I'm trying to do.
henriksen
+1  A: 

You can try out different replication profiles or create your own. Different profiles are optimized for different network/bandwidth scenarios.
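As a starting point, the predefined profiles for an agent can be listed at the Distributor; agent_type 3 should be the Distribution Agent, though check the documentation for the exact values:

```sql
-- List the available agent profiles for the Distribution Agent.
-- Run at the Distributor; agent_type = 3 is the Distribution Agent.
EXEC sp_help_agent_profile @agent_type = 3;
```

From there you can assign a profile tuned for slow links or define your own with adjusted batch sizes and polling intervals.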

MSDN documents replication agent profiles.

Matt Spradley
+1  A: 

Have you considered getting a WAN accelerator appliance? I'm too new here to post a link, but there are several available.

Essentially, the appliance on the sending end compresses the outgoing data, and the receiving end decompresses it, all on the fly and completely invisibly. This has the benefit of increasing the apparent speed of the traffic and not requiring you to change your server configurations. It should be entirely transparent.

Matt Simmons