I'm trying to populate a table with user information in a MS SQL database with information from multiple data sources (i.e. LDAP and some other MS SQL databases). The process needs to run as a daily scheduled task to ensure that the user information table is updated frequently.

The initial attempt at this query/update script was written in VBScript: it would query each data source and then update the user information table. Unfortunately, it takes a very long time to run and update the table.

I'm curious whether anyone has written anything similar, and whether you noticed a performance improvement by writing the script in another language. Some have recommended Perl because of its multi-threading support. If anyone has other suggestions on ways to improve the process, or other approaches entirely, could you share tips or lessons learned?

A: 

Hmmm. Seems like you could cron a script that uses dump utils from the various sources, then seds the output into good form for the load util for the target database. The script could be in bash or Perl, whatever.
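A minimal sketch of the transform step in that pipeline, in Python rather than sed. The pipe-delimited dump format and field names here are assumptions for illustration, not anything from the question:

```python
# Sketch of the dump -> transform -> load idea, assuming each source
# can export pipe-delimited text. Field names are hypothetical.
def transform(dump_lines):
    """Normalize raw dump rows into the load format for the target table."""
    rows = []
    for line in dump_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the dump
        user_id, name, email = line.split("|")
        rows.append((user_id.strip(), name.strip(), email.strip().lower()))
    return rows

dump = ["1001|Alice Smith|Alice.Smith@EXAMPLE.COM", "", "1002|Bob Jones|bob@example.com"]
print(transform(dump))
# → [('1001', 'Alice Smith', 'alice.smith@example.com'), ('1002', 'Bob Jones', 'bob@example.com')]
```

The normalized rows can then be fed to the target database's bulk-load utility (bcp/BULK INSERT on SQL Server) instead of being inserted one at a time.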

Edit: In terms of performance, I think the first thing you want to try is to make sure that you disable any autocommit at the beginning of the load process, then issue the commit after writing all the records. This can make a HUGE performance difference.
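To illustrate the single-commit idea, here is a sketch using Python's built-in SQLite as a stand-in for the target database (the table name and columns are made up). The point is the one explicit transaction wrapping the whole load:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

rows = [(i, f"user{i}") for i in range(1000)]

# One explicit transaction around the whole load instead of an
# implicit commit per row: far fewer log flushes.
with conn:  # BEGIN ... COMMIT around the block
    conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", rows)

print(conn.execute("SELECT COUNT(*) FROM users").fetchone()[0])
# → 1000
```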

Don Branson
A: 

I can't tell you how to solve your particular problem, but whenever you run into this situation you want to find out why it is slow before you try to solve it. Where is the slowdown? Some major things to consider and investigate include:

  • getting the data
  • interacting with the network
  • querying the database
  • updating indices in the database

Get some timing and profiling information to figure out where to concentrate your efforts.
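A crude way to get that timing information is to wrap each stage of the job in a stopwatch. The stage names and stub functions below are hypothetical placeholders for the real LDAP/database calls:

```python
import time

def timed(label, fn, *args):
    """Run fn(*args), print the elapsed wall-clock time, and return the result."""
    start = time.perf_counter()
    result = fn(*args)
    print(f"{label}: {time.perf_counter() - start:.3f}s")
    return result

# Hypothetical stages of the nightly job, stubbed out for illustration:
ldap_rows = timed("fetch LDAP", lambda: [("jdoe", "jdoe@example.com")])
db_rows = timed("query source DBs", lambda: [("asmith", "asmith@example.com")])
timed("update user table", lambda rows: len(rows), ldap_rows + db_rows)
```

Whichever stage dominates the total is where to concentrate the tuning effort.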

brian d foy
+1  A: 

It's good practice to use Data Transformation Services (DTS), or SSIS as it has become known, for repetitive DB tasks. Although this won't solve your problem on its own, it may give some pointers to what is going on, as you can log each stage of the process, wrap it in transactions, etc. It is especially well suited for bulk loading and updates, and it supports VBScript natively, so there should be no problem there.

Other than that, I have to agree with Brian: find out what's making it slow and fix that. Changing languages is unlikely to fix it on its own, especially if you have an underlying issue. As a general point, my experience with LDAP, which is pretty limited, was that it could be incredibly slow at reading bulk user details.

MrTelly
A: 

As MrTelly said, use SSIS or DTS, then schedule the package to run. Just converting to this alone will probably fix your speed issue, as they have tasks that are optimized for bulk inserting. I would never do this in a scripting language rather than T-SQL anyway. Likely your script works row by row instead of on sets of data, but that is just a guess.
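To show the row-by-row versus set-based difference concretely, here is a sketch using Python's built-in SQLite as a stand-in (table and column names invented; in T-SQL you would write this as an `UPDATE ... FROM ... JOIN` against a staging table):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
CREATE TABLE staging (id INTEGER PRIMARY KEY, email TEXT);
INSERT INTO users VALUES (1, 'old1@example.com'), (2, 'old2@example.com');
INSERT INTO staging VALUES (1, 'new1@example.com'), (2, 'new2@example.com');
""")

# Set-based: one statement updates every matching row, instead of a
# client-side loop issuing thousands of single-row UPDATEs.
conn.execute("""
    UPDATE users
    SET email = (SELECT s.email FROM staging s WHERE s.id = users.id)
    WHERE id IN (SELECT id FROM staging)
""")
print(conn.execute("SELECT email FROM users ORDER BY id").fetchall())
# → [('new1@example.com',), ('new2@example.com',)]
```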

HLGEM