I'm trying to find a way to 'throttle' CDC on SQL2008.

The reason is that under normal circumstances CDC performs brilliantly, but as soon as it needs to deal with a 'large' number of rows, it starts tanking.

Typical throughput is between 1000 and 3000 rows a second. It starts to die at about 5000 rows per second.

Usually this is not an issue, since we're using CDC to keep two databases in sync as a near real-time ETL process for statistical modelling. In the past, for bulk data moves, we've had to come up with dodgy manual methods. I'm wondering if I can throw a huge amount of data at it, but find a way to tell it to only do, say, 5 transactions at a time, or otherwise force it to work through bite-sized chunks (however long that takes), rather than try to do them all at once and suffer.
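Something like this T-SQL sketch is what I have in mind, walking the change table five source transactions at a time (the capture instance name dbo_MyTable is just a placeholder):

    -- Walk the valid LSN range in bite-sized chunks of 5 transactions.
    DECLARE @from_lsn  binary(10),
            @to_lsn    binary(10),
            @batch_end binary(10);

    SET @from_lsn = sys.fn_cdc_get_min_lsn('dbo_MyTable');
    SET @to_lsn   = sys.fn_cdc_get_max_lsn();

    WHILE @from_lsn <= @to_lsn
    BEGIN
        -- Each distinct __$start_lsn marks one source transaction, so
        -- TOP (5) bounds the batch at five transactions' worth of changes.
        SELECT @batch_end = MAX(start_lsn)
        FROM (SELECT DISTINCT TOP (5) __$start_lsn AS start_lsn
              FROM cdc.fn_cdc_get_all_changes_dbo_MyTable(@from_lsn, @to_lsn, 'all')
              ORDER BY __$start_lsn) AS batch;

        IF @batch_end IS NULL BREAK;  -- no changes left in the range

        -- Apply just this chunk to the target database.
        SELECT *
        FROM cdc.fn_cdc_get_all_changes_dbo_MyTable(@from_lsn, @batch_end, 'all');

        -- Advance past the chunk we just processed.
        SET @from_lsn = sys.fn_cdc_increment_lsn(@batch_end);
    END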

A: 

Please see: Tuning the Performance of Change Data Capture in SQL Server 2008

Are you sure that CDC is the right solution for what you are trying to achieve? I'm just wondering if SQL Server Change Tracking with ADO.NET Sync Services might be more appropriate? There's a rough sketch of what that looks like below.

Mitch Wheat
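For comparison, the Change Tracking route looks roughly like this on the consuming side (a sketch only; it assumes change tracking is already enabled on the database and on a table dbo.MyTable with primary key Id, both hypothetical names). Note that Change Tracking records only that a row changed, not the intermediate values; CDC keeps the changed data itself.

    DECLARE @last_sync bigint = 0;  -- version saved from the previous sync

    -- Rows changed since @last_sync; deletes have no match in the base table.
    SELECT ct.SYS_CHANGE_OPERATION, ct.SYS_CHANGE_VERSION, t.*
    FROM CHANGETABLE(CHANGES dbo.MyTable, @last_sync) AS ct
    LEFT JOIN dbo.MyTable AS t ON t.Id = ct.Id;

    -- Save this version number for the next sync cycle.
    SELECT CHANGE_TRACKING_CURRENT_VERSION();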
I've read through that. Changing the values didn't affect the 'tank' point. I think this is because those values control the flow of data into the change tables; my issue is more with the flow of data out of those same tables. I'm thinking about using the __$start_lsn column, and then some kind of rowcount operation to determine how big those transactions are (each __$start_lsn should be analogous to a 'transaction' for all intents and purposes). I'll then know how many transactions, or LSNs, I can process in a batch, roughly like the sketch below. But I'm kinda struggling with the concepts!
divv
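A rough sketch of that rowcount idea, reading transaction sizes straight from the change table (the change table name cdc.dbo_MyTable_CT is a placeholder for whatever your capture instance is called):

    -- Rows contributed by each source transaction: one group per __$start_lsn.
    -- Use these counts to decide how many LSNs to take into one batch.
    SELECT __$start_lsn,
           COUNT(*) AS rows_in_txn
    FROM cdc.dbo_MyTable_CT
    GROUP BY __$start_lsn
    ORDER BY __$start_lsn;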
I kinda oversimplified things in my question. We do a little more than merely sync. We're using CDC so we get the historical view of how the data changed over time (which is important to our statistical modelling process).
divv
You might want to ask a more detailed question, and someone might have ideas.
Mitch Wheat