views:

775

answers:

5

I have a table with a heavy load (many inserts/updates/deletes) in a SQL 2005 database. I'd like to do some post-processing for all these changes in as close to real time as possible (asynchronously, so as not to lock the table in any way). I've looked at a number of possible solutions but just can't seem to find that one neat solution that feels right.

The post-processing is fairly heavy as well, so much so that the Windows listener service is actually going to pass the processing over to a number of machines. However, this part of the application is already up and running, completely asynchronous, and not what I need help with. I mention it only because it affects the design decision: we couldn't just load up some CLR object in the DB to complete the processing.

So the simple problem remains: data changes in a table, and I want to do some processing in C# code on a remote server.

At present we've come up with a SQL trigger which executes xp_cmdshell to launch an exe, which raises an event that the Windows service is listening for. This just feels bad.

However, other solutions I've looked at online feel rather convoluted too. For instance, setting up SqlCacheDependency also involves setting up Service Broker. Another possible solution is a CLR trigger which calls a web service, but there are many warnings online against going that route, especially when performance is critical.

Ideally we wouldn't depend on the table changes but would rather intercept the call inside our application and notify the service from there. Unfortunately, we have some legacy applications making changes to the data too, and monitoring the table is the only centralised place at the moment.

Any help would be most appreciated.

Summary:

  • Need to respond to table data changes in real time
  • Performance is critical
  • High volume of traffic is expected
  • Polling and scheduled tasks are not an option (nor real time)
  • Implementing Service Broker seems too big (but might be the only solution?)
  • CLR code is not yet ruled out, but needs to be performant if suggested
  • Listener / monitor may be a remote machine (likely on the same physical network)
A: 

Since you said there are many inserts running on that table, batch processing could be a better fit.

Why not just create a scheduled job which handles new data identified by a flag column, and processes it in large chunks?
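One run of such a job might look roughly like this (table and column names here are illustrative assumptions, not from the question). The `UPDATE TOP ... OUTPUT` form, available since SQL 2005, claims a chunk and flips the flag in a single atomic statement, so two overlapping runs never pick the same rows:

```csharp
using System;
using System.Data.SqlClient;

class BatchJob
{
    // One run of the scheduled job: claim up to 1000 rows whose flag column
    // (Processed) is still 0, flip the flag, and read back their keys.
    public static void ProcessPendingChunk(SqlConnection conn)
    {
        const string claimSql = @"
            UPDATE TOP (1000) dbo.HeavyTable
            SET    Processed = 1
            OUTPUT inserted.Id
            WHERE  Processed = 0;";

        using (var cmd = new SqlCommand(claimSql, conn))
        using (var reader = cmd.ExecuteReader())
        {
            while (reader.Read())
            {
                int id = reader.GetInt32(0);
                // ... post-process the row identified by id ...
            }
        }
    }
}
```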

Rubens Farias
Thanks for the reply. Unfortunately this solution, as I understand it, would not be in real time, nor would any solution that requires a polling mechanism.
Mike Tours
A: 

This could be done in many ways. The method below is simple, since you don't want to use CLR triggers or the xp_cmdshell option.

  • Instead of a CLR trigger, create a normal insert trigger which updates a dedicated tracking table on each insert.

  • Then develop a dedicated Windows service which actively polls the tracking table, notifies the remote service when there is any change in the data, and sets the status in the tracking table to done (so the row won't be picked up again).
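A minimal sketch of that poll loop, assuming a tracking table along these lines (`dbo.ChangeTracking` with `RecordId` and a `Status` column, both hypothetical names):

```csharp
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading;

class TrackingTablePoller
{
    public static void Run(string connectionString, CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            using (var conn = new SqlConnection(connectionString))
            {
                conn.Open();

                // Collect rows that have not been processed yet.
                var pending = new List<int>();
                using (var select = new SqlCommand(
                    "SELECT RecordId FROM dbo.ChangeTracking WHERE Status = 0", conn))
                using (var reader = select.ExecuteReader())
                {
                    while (reader.Read())
                        pending.Add(reader.GetInt32(0));
                }

                foreach (int id in pending)
                {
                    NotifyRemoteService(id); // hand off to the processing farm

                    // Mark the row done so it won't be picked up again.
                    using (var update = new SqlCommand(
                        "UPDATE dbo.ChangeTracking SET Status = 1 WHERE RecordId = @id",
                        conn))
                    {
                        update.Parameters.AddWithValue("@id", id);
                        update.ExecuteNonQuery();
                    }
                }
            }

            Thread.Sleep(500); // poll interval: trade latency against load
        }
    }

    static void NotifyRemoteService(int recordId) { /* call the remote service */ }
}
```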

EDIT:

I think Microsoft Sync Services for ADO.NET can work for you. Check out the links below; they may help.

Cheers

Ramesh Vel
Thanks for the response. We'd really like it to be a notification rather than use polling. Polling has been discussed and ruled out as a solution, unfortunately, and isn't real time either. Thanks again
Mike Tours
@Mike, check out my edit.. that may be helpful..
Ramesh Vel
Thanks again. I had a look at this, but like Service Broker it's a fairly obtuse way to try to find a solution. For the most part it's a way to monitor all changes, including schema changes. The article also mentions that it has a number of disadvantages, which given my "real time" and "high performance" requirements are going to make it hard to implement.
Mike Tours
A: 

Use the typical trigger to fire a CLR stored procedure on the database. The CLR procedure then just starts a program remotely using the Win32_Process class:

http://motevich.blogspot.com/2007/11/execute-program-on-remote-computer.html
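A sketch of the technique the link describes, using WMI from System.Management (the machine name and exe path are placeholders; this assumes the caller's credentials are valid on the remote box):

```csharp
using System;
using System.Management;

class RemoteLauncher
{
    // Starts an executable on a remote machine via WMI (Win32_Process.Create).
    static void Main()
    {
        string machineName = "PROCESSING01";          // hypothetical worker machine
        string exePath = @"C:\Jobs\ProcessChanges.exe"; // hypothetical exe path

        var scope = new ManagementScope(
            String.Format(@"\\{0}\root\cimv2", machineName),
            new ConnectionOptions()); // uses current credentials
        scope.Connect();

        using (var processClass = new ManagementClass(
            scope, new ManagementPath("Win32_Process"), new ObjectGetOptions()))
        {
            ManagementBaseObject inParams = processClass.GetMethodParameters("Create");
            inParams["CommandLine"] = exePath;

            ManagementBaseObject outParams =
                processClass.InvokeMethod("Create", inParams, null);

            // ReturnValue 0 means the process was created successfully.
            Console.WriteLine("ReturnValue: {0}, ProcessId: {1}",
                outParams["ReturnValue"], outParams["ProcessId"]);
        }
    }
}
```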

KM
Thanks for the response. This isn't too dissimilar to the approach where we use xp_cmdshell to start a process. Your approach does address the ability to start it remotely; however, I still don't feel it's the best approach, i.e. data changes, fire trigger, run CLR, start exe (locally or remotely), fire an event, Windows service listens for the event. It just seems a bit mad, but I'm beginning to feel like it's the only way. Thanks again
Mike Tours
At least with this method it actually starts the processing on the remote machine in real time; it doesn't issue a request that has to wait for some polling to process it. You could remove the CLR and use a trigger to call EXEC xp_cmdshell XYZ.exe, where XYZ.exe uses the Win32_Process class to remotely run the exe. I'm afraid there isn't much else you can do.
KM
+1  A: 

You really don't have that many ways to detect changes in SQL 2005. You already listed most of them.

Query Notifications. This is the technology that powers SqlDependency and its derivatives; you can read more details in The Mysterious Notification. But QN is designed to invalidate cached results, not to proactively report the content of changes. You will only know that the table has changed, without knowing what changed. On a busy system this will not work, as the notifications will arrive pretty much continuously.
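For reference, the SqlDependency pattern QN powers looks roughly like this; note the callback only tells you *that* something changed, which is why it drowns on a busy table. Connection string, table, and column names are placeholders:

```csharp
using System;
using System.Data.SqlClient;

class QnExample
{
    const string ConnStr =
        "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI";

    static void Main()
    {
        SqlDependency.Start(ConnStr); // needs Service Broker enabled on the DB

        using (var conn = new SqlConnection(ConnStr))
        {
            conn.Open();
            // QN has strict query rules: two-part names, explicit column list.
            var cmd = new SqlCommand("SELECT Id, Payload FROM dbo.HeavyTable", conn);
            var dep = new SqlDependency(cmd);
            dep.OnChange += (sender, e) =>
            {
                // Fires once; the subscription must then be recreated.
                Console.WriteLine("Table changed: {0}/{1}", e.Type, e.Info);
            };
            using (var reader = cmd.ExecuteReader())
            {
                while (reader.Read()) { /* consume to register the subscription */ }
            }
        }

        Console.ReadLine(); // keep the process alive to receive the callback
        SqlDependency.Stop(ConnStr);
    }
}
```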

Log reading. This is what transactional replication uses and is the least intrusive way to detect changes. Unfortunately it is only available to internal components. Even if you manage to understand the log format, the problem is that you need support from the engine to mark the log as 'in use' until you read it, or it may be overwritten. Only transactional replication can do this sort of special marking.

Data compare. Rely on timestamp columns to detect changes. This is also pull based, quite intrusive, and has problems detecting deletes.

Application layer. This is the best option in theory, unless there are changes occurring to the data outside the scope of the application, in which case it crumbles. In practice there are always changes occurring outside the scope of the application.

Triggers. Ultimately, this is the only viable option. All trigger-based change mechanisms work the same way: they queue up the change notification for a component that monitors the queue.

There are always suggestions to do a tightly coupled, synchronous notification (via xp_cmdshell, xp_olecreate, CLR, notify with WCF, you name it), but all these schemes fail in practice because they are fundamentally flawed:
- they do not account for transaction consistency and rollbacks
- they introduce availability dependencies (the OLTP system cannot proceed unless the notified component is online)
- they perform horribly as each DML operation has to wait for an RPC call of some form to complete

If the triggers do not actively notify the listeners, but only queue up the notifications, there is the problem of monitoring the notification queue (when I say 'queue', I mean any table that acts as a queue). Monitoring implies polling for new entries in the queue, which means balancing the frequency of checks against the load of changes, and reacting to load spikes. This is not trivial at all; actually it is very difficult. However, there is one statement in SQL Server that has the semantics to block, without polling, until changes become available: WAITFOR(RECEIVE). That means Service Broker. You mentioned SSB several times in your post, but you are, rightfully so, scared of deploying it because of the big unknown. But the reality is that it is, by far, the best fit for the task you described.

You do not have to deploy a full SSB architecture where the notification is delivered all the way to the remote service (that would require a remote SQL instance anyway, even an Express one). All you need to accomplish is to decouple the moment when the change is detected (the DML trigger) from the moment when the notification is delivered (after the change is committed). For this, all you need is a local SSB queue and service. In the trigger you SEND a change notification to the local service. After the original DML transaction commits, the service procedure activates and delivers the notification, using CLR for instance. You can see an example of something similar to this at Asynchronous T-SQL.
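The consumer side of that local queue can be sketched like this: the loop blocks inside WAITFOR(RECEIVE) instead of polling. The queue name and connection string are assumptions, and a production version would wrap the RECEIVE in a transaction and handle the EndDialog/Error system message types:

```csharp
using System;
using System.Data.SqlClient;

class BrokerListener
{
    const string ConnStr =
        "Data Source=.;Initial Catalog=MyDb;Integrated Security=SSPI";

    static void Main()
    {
        using (var conn = new SqlConnection(ConnStr))
        {
            conn.Open();
            while (true)
            {
                // Blocks server-side until a message arrives, or 60s elapse.
                using (var cmd = new SqlCommand(
                    @"WAITFOR (
                          RECEIVE TOP (1)
                              conversation_handle, message_type_name,
                              CAST(message_body AS NVARCHAR(MAX)) AS body
                          FROM dbo.ChangeNotificationQueue
                      ), TIMEOUT 60000;", conn))
                {
                    cmd.CommandTimeout = 0; // let WAITFOR block indefinitely
                    using (SqlDataReader reader = cmd.ExecuteReader())
                    {
                        while (reader.Read())
                        {
                            string body = reader.IsDBNull(2) ? null : reader.GetString(2);
                            Dispatch(body); // hand off to the processing farm
                        }
                    }
                }
            }
        }
    }

    static void Dispatch(string payload)
    {
        Console.WriteLine("Change notification: {0}", payload);
    }
}
```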

If you go down that path, there are some tricks you'll need to learn to achieve high throughput, and you must understand the concept of ordered delivery of messages in SSB. I recommend you read these links:

About the means to detect changes, SQL 2008 apparently adds new options: Change Data Capture and Change Tracking. I emphasize 'apparently', since they are not really new technologies. CDC uses the log reader and is based on the existing transactional replication mechanisms. CT uses triggers and is very similar to the existing merge replication mechanisms. They are both intended for occasionally connected systems that need to sync up, and hence are not appropriate for real-time change notification. They can populate the change tables, but you are left with the task of monitoring those tables for changes, which is exactly where you started.

Remus Rusanu
Many thanks for your comprehensive response. I can clearly tell you understand the problem domain. The possibility of using SSB is certainly there, and I'll need to do the extra reading to make sure. Thanks again
Mike Tours
A: 

In similar circumstances we are using a CLR trigger that writes messages to a queue (MSMQ). A service written in C# monitors the queue and does the post-processing. In our case it is all done on the same server, but you can send those messages directly to a remote queue on a different machine, totally bypassing the "local listener".

The code called from trigger looks like this:

using System;
using System.Messaging;
using System.Threading;

public static void SendMsmqMessage(string queueName, string data)
{
    // Define the queue path based on the input parameter.
    string QueuePath = String.Format(@".\private$\{0}", queueName);

    try
    {
        if (!MessageQueue.Exists(QueuePath))
            MessageQueue.Create(QueuePath);

        // Open the queue with the Send access mode.
        using (MessageQueue MSMQueue = new MessageQueue(QueuePath, QueueAccessMode.Send))
        {
            // Define the queue message formatting and create the message.
            BinaryMessageFormatter MessageFormatter = new BinaryMessageFormatter();
            Message MSMQMessage = new Message(data, MessageFormatter);

            MSMQueue.Send(MSMQMessage);
        }
    }
    catch (Exception x)
    {
        // Log asynchronously: we have to return from the trigger ASAP.
        ThreadPool.QueueUserWorkItem(new WaitCallback(LogException), x);
    }
}

private static void LogException(object state)
{
    // Write the exception to your log of choice; shown as a stub here.
}
Dmitry Duginov
Thank you for your response. I think this has the same sort of drawbacks for me as using Service Broker. The listener service we want to talk to was originally written for the purpose of moving away from MSMQ in the first place, so it seems like a step backwards to have to implement MSMQ, and it again exposes us to the original problem of having to poll the queue for new messages. We might as well poll the database for changes? But if this is the only way forward, it is one of the neater solutions and I prefer it to using xp_cmdshell for sure.
Mike Tours