We have a MySQL database (mostly read-only, so MyISAM tables) living in the data center that talks to a SQL Server DB living on-site. There is significant latency on the WAN (upwards of 100ms); in about 6 months the SQL Server DBMS will be moving up to the data center (i.e., onto the same gigabit LAN).

In the MySQL DB, I have several thousand rows that I need to update from results in the SQL Server database. The MySQL DB is attached to a Rails app running on Linux, so I'd like to keep as much of the data-migration logic as possible in shell scripts or rake/Ruby tasks (we are not Windows app developers, so Win32 apps and the like are right out!).

It is a fairly simple process. In pseudocode:

SELECT id
     , amount
  FROM bank_account.on_SQL_Server
 WHERE (some logic)

FOREACH ROW:
  UPDATE bank_account.on_MySQL
     SET amount = $some_amount
   WHERE id     = $some_id

Let's assume that there are several thousand rows that need to be updated, and updated frequently (every 5 minutes). Also assume that I have no way of knowing which rows in SQL Server had an amount change (unfortunately!), so I can't limit the transfer to just the modified rows; I have to send them all over (yuck, but the SQL Server DB belongs to a 3rd-party app that can't be modified. edit: we DO have control over the DBMS, so we could do some light modification, such as a trigger on the table or a new stored procedure -- just no table schema changes to add, say, a last-updated column -- but I would like to save this option as a last resort).

What is the best way to do this update while minimizing runtime? This process will need to run every few minutes (the sooner the better), and issuing dual connections to SQL Server and MySQL from Ruby is just too slow. The bottleneck could be write table locks taken by the MyISAM engine, but converting to InnoDB doesn't seem to make it any faster (the system is in test, so it is not easy to simulate the sort of load production would receive, which leads me to believe it is not lock-related).

I'm currently leaning towards BCP'ing a VIEW (that corresponds to the SQL Server statement above) to a file, FTP'ing it to the Linux server, then using Ruby to iterate over the file (executing a LOT of serialized UPDATE statements), but I have to imagine there are better ways. A rough sketch of that pipeline (all names here are hypothetical; vw_bank_amounts stands in for a view wrapping the SELECT above):
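# Export the view to a comma-delimited file (bcp ships with SQL Server;
# -c = character mode, -t = field terminator)
bcp "bank.dbo.vw_bank_amounts" out amounts.csv -c -t, -S SQLHOST -U user -P secret

# ...then transfer amounts.csv to the Linux server (the scripted FTP step
# goes here) and have the Ruby task read it line by line, issuing one
# UPDATE per row against MySQL.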

A: 

You might be able to add another table on the MSSQL side that lets you find updated rows, either by caching old values or via triggers (yuck). A hedged sketch of the trigger flavor (table and column names are made up, and it assumes you may add new objects even though you can't alter the existing table):
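-- Hypothetical change-log table; the existing bank_account schema is untouched
CREATE TABLE bank_account_changes (
    id         INT           NOT NULL,
    amount     DECIMAL(19,4) NOT NULL,
    changed_at DATETIME      NOT NULL DEFAULT GETDATE()
);

CREATE TRIGGER trg_bank_account_amount
ON bank_account
AFTER INSERT, UPDATE
AS
BEGIN
    -- "inserted" is the built-in pseudo-table holding the new row values
    INSERT INTO bank_account_changes (id, amount)
    SELECT id, amount FROM inserted;
END

The sync job could then read (and clear) bank_account_changes instead of shipping every row across the WAN each time.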

Orthogonal to that, you might see whether there is a MySQL storage engine that gives a remote view of an MSSQL DB, the way the FEDERATED engine does for other MySQL servers. For reference, this is roughly what FEDERATED looks like (it only speaks to other MySQL servers, so this just illustrates the kind of engine you'd be hoping exists for MSSQL; names are made up):
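CREATE TABLE remote_bank_account (
    id     INT           NOT NULL,
    amount DECIMAL(19,4) NOT NULL,
    PRIMARY KEY (id)
)
ENGINE=FEDERATED
CONNECTION='mysql://user:pass@remote-host:3306/bank/bank_account';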

BCS
Ah, so can one add a linked server that is MySQL, then update the table directly from SQL Server?
Matt Rogish
+1  A: 

You can add MySQL as a linked server to SQL Server by following these instructions. The gist is something like this (a sketch, assuming the MySQL ODBC driver is installed on the SQL Server machine and a system DSN named MySQL_DSN points at the MySQL server):
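-- Register the MySQL server as a linked server via the ODBC provider
EXEC sp_addlinkedserver
     @server     = 'MYSQL',
     @srvproduct = 'MySQL',
     @provider   = 'MSDASQL',
     @datasrc    = 'MySQL_DSN';

-- Map SQL Server logins to a MySQL account
EXEC sp_addlinkedsrvlogin
     @rmtsrvname  = 'MYSQL',
     @useself     = 'FALSE',
     @rmtuser     = 'mysql_user',
     @rmtpassword = 'secret';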

From there you can do something like a join between your MySQL table and your MSSQL table where the amounts are not equal, and update them accordingly. You can then run this as a SQL Job to keep the DBs in sync every 5 minutes.

e.g.

SELECT * FROM mysql.dbo.bank_account myb 
INNER JOIN bank_account sqlb 
ON myb.id = sqlb.id 
AND sqlb.amount <> myb.amount
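
The corresponding update might look something like this (hedged: it keeps the same linked-server naming as the SELECT above, and some ODBC providers will force you to go through OPENQUERY for writes instead of four-part names):

UPDATE myb
   SET myb.amount = sqlb.amount
  FROM mysql.dbo.bank_account myb
 INNER JOIN bank_account sqlb
    ON myb.id = sqlb.id
 WHERE myb.amount <> sqlb.amount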

There may be smarter, less resource-intensive ways of figuring out what data has changed once you can query the MySQL table from inside SQL Server, but this is a start. Scheduling it every 5 minutes could be done with a SQL Server Agent job, roughly like this (names are hypothetical; @command would hold the UPDATE above):
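EXEC msdb.dbo.sp_add_job @job_name = 'SyncBankAmounts';

EXEC msdb.dbo.sp_add_jobstep
     @job_name  = 'SyncBankAmounts',
     @step_name = 'push amounts to MySQL',
     @subsystem = 'TSQL',
     @command   = 'UPDATE myb SET ... -- the join/update from above';

-- freq_subday_type = 4 means "minutes"; interval = 5 gives every 5 minutes
EXEC msdb.dbo.sp_add_jobschedule
     @job_name             = 'SyncBankAmounts',
     @name                 = 'Every5Minutes',
     @freq_type            = 4,   -- daily
     @freq_interval        = 1,
     @freq_subday_type     = 4,
     @freq_subday_interval = 5;

EXEC msdb.dbo.sp_add_jobserver @job_name = 'SyncBankAmounts';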

Nick Kavadias
this is pretty rad. Should you copy-paste the sol'n in case the source disappears? I don't remember the etiquette
Matt Rogish
I don't think that source will disappear any time soon. Maybe it's another good SO question: instructions for a SQL Server linked query
Nick Kavadias