Hi,
I have two heterogeneous databases, one in MySQL and one in MS SQL Server.
I want to keep them in sync.
Data will flow periodically, and in both directions.
Does anyone have strategies or approaches for this?
Anand, you can find this in Google.
I've not used this software, but they offer a free trial
Assuming you aren't going to use a ready-made solution, you have a few options open to you. What you're trying to do is capture the changes made in one database and replicate them in the other.
Full Extract and Delta
Take a complete dump, sorted by key, of every row in the table(s) you want to sync and compare it row by row against the dump from the last sync you ran. Having the output sorted makes the compare process a lot quicker, as you can figure out if a row has been changed, added or deleted without
This option should be quite viable for smaller or medium sized databases.
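A minimal sketch of the compare step, assuming both dumps are already sorted by key and each row is a (key, payload) tuple. The row shape and the name `delta` are made up for illustration; real dumps would have more columns and would be read from files rather than lists.

```python
from typing import Iterable, Iterator, Tuple

Row = Tuple[int, str]  # hypothetical row shape: (key, payload)


def delta(old: Iterable[Row], new: Iterable[Row]) -> Iterator[Tuple[str, Row]]:
    """Walk two key-sorted dumps once and yield (operation, row) pairs."""
    old_it, new_it = iter(old), iter(new)
    o, n = next(old_it, None), next(new_it, None)
    while o is not None or n is not None:
        if n is None or (o is not None and o[0] < n[0]):
            # Key exists only in the previous dump: the row was deleted.
            yield ("delete", o)
            o = next(old_it, None)
        elif o is None or n[0] < o[0]:
            # Key exists only in the current dump: the row was added.
            yield ("insert", n)
            n = next(new_it, None)
        else:
            # Same key in both dumps: report it only if the content differs.
            if o != n:
                yield ("update", n)
            o, n = next(old_it, None), next(new_it, None)


previous = [(1, "a"), (2, "b"), (3, "c")]
current = [(1, "a"), (2, "B"), (4, "d")]
for op, row in delta(previous, current):
    print(op, row)
```

Because both inputs are sorted, each dump is read exactly once and only one row from each needs to be held in memory at a time.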
Transaction Logs
Analyze the transaction logs from the database in order to find out what changed, and apply those changes to the other database.
Possibly a good idea if you can count on the logs being available.
Triggers
Use triggers to record the changes, and replicate them to the other database.
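Roughly, the trigger approach could look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in so the example runs anywhere; MySQL and MS SQL Server each have their own trigger syntax, and the table names `customers` and `change_log` are invented for illustration.

```python
import sqlite3

# Stand-in schema: one data table plus a log table filled by triggers.
SCHEMA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE change_log (
    seq    INTEGER PRIMARY KEY AUTOINCREMENT,
    op     TEXT NOT NULL,      -- 'I', 'U' or 'D'
    row_id INTEGER NOT NULL
);
CREATE TRIGGER customers_ins AFTER INSERT ON customers
BEGIN
    INSERT INTO change_log (op, row_id) VALUES ('I', NEW.id);
END;
CREATE TRIGGER customers_upd AFTER UPDATE ON customers
BEGIN
    INSERT INTO change_log (op, row_id) VALUES ('U', NEW.id);
END;
CREATE TRIGGER customers_del AFTER DELETE ON customers
BEGIN
    INSERT INTO change_log (op, row_id) VALUES ('D', OLD.id);
END;
"""


def pending_changes(conn, after_seq=0):
    """Return the logged changes newer than the last sequence number synced."""
    cur = conn.execute(
        "SELECT seq, op, row_id FROM change_log WHERE seq > ? ORDER BY seq",
        (after_seq,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO customers (id, name) VALUES (1, 'Anand')")
conn.execute("UPDATE customers SET name = 'Anand K' WHERE id = 1")
conn.execute("DELETE FROM customers WHERE id = 1")
print(pending_changes(conn))
```

A periodic job would read `pending_changes` since the last sequence number it handled and apply each change to the other database. With two-way sync, one thing to watch is that the job's own writes also fire the triggers on the receiving side, so they need to be told apart from genuine changes.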
Synchronization in the Application
Simply make sure the application writes to both databases.
This could be made to work if the application writes to the database only through a few controlled modules (i.e. there aren't many places where you could forget to update both databases). In a less managed setup (i.e. multiple applications, uncontrolled or poorly factored database access, ad-hoc scripts) this simply isn't an option.
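As a rough illustration of a single controlled write module, the sketch below sends every write to both databases and rolls both back if either statement fails. The class name `DualWriter` is hypothetical, and two in-memory SQLite connections stand in for the MySQL and MS SQL Server connections. Note that the two commits are still separate, so a crash between them can leave the databases out of step.

```python
import sqlite3


class DualWriter:
    """Hypothetical single entry point for writes that must reach both databases."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def execute(self, sql, params=()):
        try:
            self.primary.execute(sql, params)
            self.secondary.execute(sql, params)
        except Exception:
            # Undo whatever was applied so neither side keeps a partial write.
            self.primary.rollback()
            self.secondary.rollback()
            raise
        # Two independent commits: this is not an atomic distributed transaction.
        self.primary.commit()
        self.secondary.commit()


# Stand-in connections; a real setup would use MySQL and MS SQL Server drivers.
a = sqlite3.connect(":memory:")
b = sqlite3.connect(":memory:")
for db in (a, b):
    db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

writer = DualWriter(a, b)
writer.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Anand"))
```

This only holds up if nothing else writes to either database directly, which is the condition described above.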
First more info is needed:
In general, if you need real-time synchronization then you end up with a replication solution. This can typically handle only a very small amount of transformation (usually done via stored procedures). It is typically a commercial product that reads the transaction logs. Since most people don't want a code dependency on log formats, they almost always go with a packaged solution.
If you don't need real-time synchronization, have vast data volumes or have significant transformation requirements, then you end up with an ETL solution. There are quite a few to choose from, but they are mostly commercial. On the other hand, they aren't difficult to develop yourself if you take the time to understand best practices, which, oddly enough, aren't talked about much. Anyhow, Adam Luchjenbroers did a good job identifying most of the approaches with ETL. I recommend the file delta approach if you can afford to lose transactions between snapshots. It is otherwise the most accurate approach, because all of the others rely on timestamps, triggers or logs, which do not capture all changes.