I'm working on a project that has a MySQL transactional database backing a web application. The company uses SQL Server for back-office and reporting applications. What is the best way to update SQL Server with the data from MySQL? Right now, we dump the MySQL data and do a full restore. This may not be feasible much longer because the database keeps growing.

I would prefer a solution that copies only newly inserted and updated rows. I also need the SQL Server database to be static after the updates are applied; basically, it should change once a day. I can update SQL Server from a local copy of MySQL (i.e. not production). Is there a way to apply MySQL replication to a slave server at specified intervals? A perfect solution would be a once-daily update on MySQL that syncs the database as of a point in time.

A: 

Look into DTS (replaced by SSIS since SQL Server 2005), Microsoft's ETL tool. It's rather nice. Do the mapping, schedule it as a job with SQL Server Agent, and Bob's your uncle.

duffymo
I imagine that I will use SSIS. The problem isn't the tool. The problem is the data source. I need a static copy of the data as a source. The production copy is a live database. My understanding is that MySQL slaves are also live.
Ed Mays
A: 

Regardless of how you do the import into SQL Server from the MySQL clone, I don't think you need to worry about restricting MySQL replication to specific times.

MySQL replication only requires one thread on the master server and basically just transfers the binary log to the slave. If you can, put the master and slave MySQL servers on a private LAN segment so that replication traffic does not impact the web traffic.

ewalshe
A: 

If you have SQL Server Standard or higher, SQL Server will take care of all of your needs.

  • use SSIS to grab the data
  • use SQL Server Agent to schedule your timed tasks

BTW, I'm doing exactly the same thing you are. SQL Server is awesome: it was easy to set up (I'm a noob to SSIS) and it worked on the first shot.

mson
I understand how to do that. What I don't know how to do is get a snapshot, if you will, of the production data. It's in a data center, not on our network. I need a point in time at which I know the data is in a consistent state. Otherwise, my target server will get out of sync quickly.
Ed Mays
SSIS can connect directly to production or to a MySQL slave, so I'm not sure I understand your problem. Do you mean you need to replicate a database and you don't have access to the source?
mson
I can connect without a problem. But the data is constantly changing. Data would change during the time the import ran. I need a copy of the data that is static during the time the SSIS process runs. Otherwise, data will be inconsistent.
Ed Mays
A: 

Can you find a way to snapshot the MySQL database and then do the copy? A snapshot would make an instant logical copy of the database, frozen in time.

http://aspiringsysadmin.com/blog/2007/08/13/consistent-mysql-backups-using-zfs-snapshots/

The ZFS filesystem can do this, but you haven't mentioned your hardware/OS.

Also, perhaps you could restrict the data you are pulling by time, so that the pull only gets data older than 1 hour if the pull takes 45 minutes. Or, to make things a little safer, how about pulling only the previous day's data?
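As a rough illustration of the "pull only the day before" idea, here is a minimal Python sketch that computes a closed-off time window. It assumes every source table has a reliably maintained last-modified column; the table name `orders` and the column name `updated_at` are hypothetical, not something from the question.

```python
from datetime import datetime, time, timedelta


def previous_day_window(now):
    """Return (start, end) covering the whole calendar day before `now`."""
    today_midnight = datetime.combine(now.date(), time.min)
    return today_midnight - timedelta(days=1), today_midnight


# Hypothetical table and column names. %s is the placeholder style used by
# common MySQL drivers for Python; pass (start, end) as the parameters.
EXTRACT_SQL = (
    "SELECT * FROM orders "
    "WHERE updated_at >= %s AND updated_at < %s"
)

start, end = previous_day_window(datetime(2009, 3, 15, 2, 30))
print(start, end)  # 2009-03-14 00:00:00 2009-03-15 00:00:00
```

Note the limits of this approach: a row modified again after midnight carries only its newest timestamp, so it is picked up on the following run, and deleted rows are not detected at all.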

I believe SSIS 2008 has a new module called 'maintain' table that does the common task of getting updated/inserted records and optionally deletes.

Sam
A: 

It sounds like what you need is a script that starts and stops replication on a slave database. If you can do that via a script, then you can establish a workflow in SSIS as follows:

  1. Stop Replication to Slave MySQL Database
  2. If Replication has Stopped, then Take Snapshot of Slave MySQL Database
  3. If Snapshot has been Taken, then
     a. Start Replication to Slave MySQL Database
     b. Import Slave MySQL Database Replica into SQL Server

NB: 3a and 3b can run in parallel.

I think your best bet in such a scenario would be to use SSIS to enable and disable MySQL database replication to the slave as well as to take a snapshot of the slave database. Then you can drive the whole thing from the SQL Server Agent mechanism.
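Outside SSIS, the same ordering could be sketched in Python roughly as below. This is only an outline under assumptions: `execute`, `take_snapshot`, and `import_snapshot` are hypothetical callables you would supply yourself (for example a wrapper around a MySQL connection, a dump or filesystem snapshot, and the SQL Server load). `STOP SLAVE` and `START SLAVE` are the MySQL statements that pause and resume replication on the slave. The check that replication has actually stopped is left out, and steps 3a and 3b run one after the other here rather than in parallel.

```python
def run_daily_sync(execute, take_snapshot, import_snapshot):
    """Pause replication, snapshot the slave, resume, then import.

    All three arguments are hypothetical callables supplied by the caller:
    execute(sql) runs a statement on the slave, take_snapshot() returns a
    handle to a frozen copy, import_snapshot(handle) loads it elsewhere.
    """
    execute("STOP SLAVE")          # step 1: slave data stops changing
    try:
        snapshot = take_snapshot()  # step 2: copy while the data is frozen
    finally:
        execute("START SLAVE")     # step 3a: resume even if the snapshot failed
    import_snapshot(snapshot)       # step 3b: import from the frozen copy
    return snapshot


# Dry run with stubs that only record the order of operations.
log = []
run_daily_sync(
    execute=log.append,
    take_snapshot=lambda: log.append("snapshot") or "snap-1",
    import_snapshot=lambda s: log.append("import " + s),
)
print(log)  # ['STOP SLAVE', 'snapshot', 'START SLAVE', 'import snap-1']
```

Because the import reads from the snapshot rather than from the slave, replication can catch up while the import is still running.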

Hope this helps

Umar Farooq Khawaja