tags:

views:

101

answers:

5

If I want to conduct some database operations on a scheduled basis, I could:

  • Use SQL Server Agent (if SQL Server) to periodically call the stored procedure and/or execute the T-SQL

  • Run some external process (scheduled by the operating system's task scheduler for example) which executes the database operation

  • etc.

Two questions:

  1. What are some other means of accomplishing this
  2. What decision criteria should one use to decide the best approach?

Thank you.

+1  A: 

Another possibility is to have a queue of tasks somewhere, and when applications that otherwise use the database perform some operation, they also do some tasks out of the queue. Wikipedia does something like this with its job queue. The scheduling isn't as certain as with the other methods, but you can e.g. put off doing housekeeping work when your server happens to be heavily loaded.

Edit: It's not necessarily better or worse than the other techniques. It's suitable for tasks that do not have to be performed by any specific deadline, but should be done "every now and then", or "soon, but not necessarily right now".

Advantages

  • You don't need to write a separate application or set up SQL Server Agent.
  • You can use any criteria you can program to decide whether to run a task or not: immediately, once a certain time has passed, or only if the server is not under heavy load.
  • If the scheduled tasks are ones like optimising indices, then you can do them less frequently when they are less necessary (e.g. when updates are rare), and more frequently when updates are common.

Disadvantages

  • You might need to modify multiple applications to cooperate correctly.
  • You need to ensure that the queue doesn't build up too much.
  • You can't reliably ensure that a task runs before a certain time.
  • You might have long periods where you get no requests (e.g. at night) where deferred/scheduled tasks could get done, but don't. You could combine it with one of the other ideas, having a special program that just does the jobs in the queue, but you could just not bother with the queue at all.
Doug
Doug, that's another possibility, but can you say how it would be better than the others?
John Saunders
Thanks, Doug: +1
John Saunders
I'd recommend you avoid reinventing the wheel when SQL ships with queued based, activation launched processing that supports timers since 2K5...
Remus Rusanu
@remus: good point, I didn't realise. You should use that instead of rolling your own.
Doug
A: 

I'd go with SQL Server Agent. It's well integrated with SQL Server; various SQL Server features use Agent (Log Shipping, for instance). You can create an Agent job to run one or more SSIS packages, for instance.

It's also integrated with operator notification, and can be scripted, or else executed through SMO.

John Saunders
A: 

I think the best approach for the decision criteria is what the job is. If it's a completely internal SQL Server task or set of tasks that does not relate to the outside world, I would say a SQL Job is the best bet. If on the other hand, you are retrieving data and then doing something with it that is inherently outside SQL Server, very difficult to do in T-SQL or time consuming, perhaps the external service is the best bet.

JP Alioto
JP, perhaps you mean, "unless you do this external thing with SSIS"?
John Saunders
John, Agreed -- excellent point. SSIS is a great way to touch the outside world. Personally, I've veered away a bit from scheduled SSIS packages because they can be somewhat expensive (spinning up a new process and such), but that may be a false perception on my part.
JP Alioto
+1  A: 

I generally go with the operating systems scheduling method (task scheduler for Windows, cron for Unix).

I deal with multiple database platforms (SQL Server, Oracle, Informix) and want to keep the task scheduling as generic as possible.

Also, in our production environment we have to get a DBA involved for any troubleshooting / restarting of jobs that are running in the database. We have better access to the application servers with the scheduled tasks on them.

Ron

Ron Savage
-1: With mixed platforms, you're right. But the question was about SQL Server.
John Saunders
Actually, it's a fairly open ended question and contains the phrase (if SQL Server) in it and then asks for opinions with reasons - which I gave. Personally, it irritates me to have perfectly accurate answers down voted based on your opinion of the OPs intent. My voting guideline is this:Better answers get an up-voteCorrect but not as good, no-voteIncorrect or bad (totally off topic), down-vote
Ron Savage
OS stored configuration and DB stored configurations don't mix very well primarily because of backup/restore implications and their propagation to high availability and disaster recoverability solutions.
Remus Rusanu
I've had better luck keeping them separate, our app servers have fewer problems (and have fail-over servers in any event) than our database servers.
Ron Savage
@Ron, you're right. I missed the "if SQL Server"
John Saunders
+1  A: 

You can't really rely on external processes. All 'OS' based solutions I've seen failed to deliver in the real world: a database is way more than just the data, primarily because of the backup/restore strategy, the high availability strategy, the disaster recoverability strategy and all the other 'ities' you pay for in your SQL Server license. An OS scheduler based will be an external component completely unaware and unintegrated with any of them. Ie. you cannot back/restore your schedule with your data, it will not fail over with your database, you cannot ship it to a remote disaster recovery site through your SQL data shipping channel.

If you have Agent (ie. not Express edition) then use Agent. Has a long history of use and the know how around it is significant. The only problem with Agent is its dependence on msdb that makes it disconnect from the application database and thus does not play well with mirroring based availability and recoverability solutions.

For Express editions (ie. no Agent) the best option is to roll your own scheduler based on conversation timers (at least in SQL 2k5 and forward). You use conversations to chedule yourself messages at the desired moment and rely on activated procedures to run the tasks. They are transactional and integrated with your database, so you can rely on them being there after a restore and after a mirroring or clustering fail over. Unfortunately the know how around how to use them is fairly skim, I have several articles about the subject on my site rusanu.com. I've seen systems replicating a fair amount of Agent API on Express relying entirely on conversation timers.

Remus Rusanu