I have some data that isn't properly "partitioned" (for lack of a better word).

All inserts, processing, and reporting happen on the same table. The bulk of the processing happens shortly after the insert, and not long after that the data becomes immutable (we're talking days).

I could do all inserts and processing on a new table that I replicate to the old table. When I detect that the data has become immutable, I would delete it from the new table, but I would edit the replication delete stored procedure so that the delete did not replicate.

How bad an idea is this? (Edit1: that is, editing the replication stored procedure.)

It seems attractive at the moment (I haven't slept on it yet) because it might mitigate a performance problem with only very small changes to the application. It also seems like it might be a good way to shoot myself in the foot.

Edit1:

I like the idea of inserting into two tables because I can avoid the view and the maintenance window described in Jono's answer. No offense, Jono, I actually use this technique elsewhere.

I might want to use replication because one table might be in another database (I know, I didn't mention this); that way I don't have to worry about committing to two tables, I just let replication handle it.

My actual concern (that I didn't make clear) is that editing the replication stored procedure could end up being a deployment/maintenance headache.
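To make the concern concrete, here is a rough sketch of the kind of edit in question. In transactional replication, SQL Server generates per-table apply procedures on the subscriber (names like `sp_MSdel_dboStaging` below are illustrative; the actual proc name and key parameters depend on your publication). Turning the delete proc into a no-op means replicated DELETEs are silently swallowed at the subscriber, so purging the staging table leaves the history table intact:

```sql
-- Hypothetical sketch only: the real proc name, parameter list, and body
-- are generated by replication and will differ in your environment.
ALTER PROCEDURE dbo.sp_MSdel_dboStaging
    @pkc1 int   -- primary key value passed by the distribution agent
AS
BEGIN
    -- Intentionally do nothing: ignore the replicated delete
    -- so rows purged from the staging table survive here.
    RETURN 0;
END
```

The deployment headache is that any snapshot reinitialization or schema change can regenerate this proc and silently undo the customization.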

+1  A: 

I wouldn't advocate replication to solve a performance issue (unless the problem is one of physical data distribution); if anything, it will slow your system down as the changes are propagated to their destination. If you're using a single server, I'd suggest adding a second table with the same schema as the first, but with indexes optimised for the kind of work you do in your processing phase. Then create a view that selects from both tables, and use that view in any query where you want the union of the two. You could then throw more hardware at the second table (I'm thinking of a separate filegroup over more spindles) and migrate the data into the first table on a weekly delay, during an available maintenance window.

Jono
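A minimal sketch of this two-table-plus-view arrangement (table and column names are made up for illustration):

```sql
-- "Hot" table takes inserts and processing; "cold" table holds
-- the immutable history, possibly on a separate filegroup.
CREATE TABLE dbo.OrdersHot  (OrderId int PRIMARY KEY, Payload varchar(100), CreatedAt datetime);
CREATE TABLE dbo.OrdersCold (OrderId int PRIMARY KEY, Payload varchar(100), CreatedAt datetime);
GO

-- Reporting queries read the union of both tables through one view.
CREATE VIEW dbo.OrdersAll AS
    SELECT OrderId, Payload, CreatedAt FROM dbo.OrdersHot
    UNION ALL
    SELECT OrderId, Payload, CreatedAt FROM dbo.OrdersCold;
GO

-- During the maintenance window, migrate rows past the cutoff:
-- INSERT INTO dbo.OrdersCold SELECT * FROM dbo.OrdersHot WHERE CreatedAt < @cutoff;
-- DELETE FROM dbo.OrdersHot WHERE CreatedAt < @cutoff;
```

Queries against `dbo.OrdersAll` see all rows regardless of which table currently holds them, so the application doesn't need to know about the split.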
I may be able to toss one of those tables onto another server, which is why I was thinking that replication might be part of my solution. Thanks for reading through my poorly worded question.
uncle brad
Rather than edit the procedure, why not code some logic into it so that it checks some shared state (a row in a table of your choosing, e.g. {"is_replication_enabled", 1}) before it attempts to ship the rows off from source to destination? Your delete procedure could set this value to 0 when it starts, and back to 1 when it finishes.
Jono
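A sketch of that flag idea, with all names hypothetical (the replication proc shown stands in for whatever procedure applies the deletes in your setup):

```sql
-- Shared state the replication proc consults before applying deletes.
CREATE TABLE dbo.ReplicationState (Name sysname PRIMARY KEY, Value int NOT NULL);
INSERT INTO dbo.ReplicationState (Name, Value) VALUES ('is_replication_enabled', 1);
GO

-- Inside the delete-apply proc: bail out when the flag is off.
ALTER PROCEDURE dbo.sp_MSdel_dboOrdersHot
    @pkc1 int
AS
BEGIN
    IF (SELECT Value FROM dbo.ReplicationState
        WHERE Name = 'is_replication_enabled') = 0
        RETURN 0;   -- purge in progress: swallow the delete
    DELETE FROM dbo.OrdersCold WHERE OrderId = @pkc1;
END
GO

-- The purge job brackets its work:
-- UPDATE dbo.ReplicationState SET Value = 0 WHERE Name = 'is_replication_enabled';
-- ... delete the immutable rows from the hot table ...
-- UPDATE dbo.ReplicationState SET Value = 1 WHERE Name = 'is_replication_enabled';
```

This keeps the customization to a small, self-documenting guard clause rather than gutting the generated procedure, though it still has to survive any regeneration of the proc.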