I need to synchronize two databases. Both store the same semantic objects, but the physical representation differs between the two.

I plan to use the DTO pattern to unify the object representation:

DB ----> DTO ----> MAPPING (Getters / Setters) ----> DTO ----> DB

I think this is a better idea than synchronizing physically with SQL queries on each side. I use Hibernate to add an abstraction layer and synchronize objects.

Do you think this is a good idea?
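To make the pipeline above concrete, here is a minimal sketch of the four steps. It uses Python with the standard `sqlite3` module rather than Hibernate, purely to keep the example self-contained; the `person` and `customer` tables, the two DTO classes, and every function name are hypothetical.

```python
import sqlite3
from dataclasses import dataclass


# Hypothetical layouts: both sides store "a customer", shaped differently.
@dataclass
class SourceCustomerDTO:
    id: int
    first_name: str
    last_name: str


@dataclass
class TargetCustomerDTO:
    customer_id: int
    full_name: str


def read_source(conn):
    """DB -> DTO: load rows using the source layout."""
    rows = conn.execute("SELECT id, first_name, last_name FROM person ORDER BY id")
    return [SourceCustomerDTO(*row) for row in rows]


def map_customer(src):
    """DTO -> DTO: the only place that knows about both layouts."""
    return TargetCustomerDTO(
        customer_id=src.id,
        full_name=f"{src.first_name} {src.last_name}",
    )


def write_target(conn, dtos):
    """DTO -> DB: insert or overwrite rows using the target layout."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR REPLACE INTO customer (customer_id, full_name) VALUES (?, ?)",
            [(d.customer_id, d.full_name) for d in dtos],
        )


def synchronize(source_conn, target_conn):
    write_target(target_conn, [map_customer(s) for s in read_source(source_conn)])


def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE person (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    source.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [(1, "Ada", "Lovelace"), (2, "Alan", "Turing")],
    )
    source.commit()
    target.execute(
        "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, full_name TEXT)"
    )
    synchronize(source, target)
    return target.execute(
        "SELECT customer_id, full_name FROM customer ORDER BY customer_id"
    ).fetchall()
```

Keeping the mapping in a single function means a schema change on either side touches only one DTO and that function.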

+1  A: 

Doing that with an ORM might be slower by an order of magnitude than a well-crafted SQL script. It depends on the size of the DB.

EDIT

I would add that the decision should depend on the amount of difference between the two schemas, not on your expertise with SQL. SQL is so common that developers should be able to write a simple script in a clean way.

SQL also has the advantage that everybody knows how to run a script, but not everybody will know how to run your custom tool (this is a problem I have encountered in practice when the migration is actually operated by somebody else).

  1. For schemas which differ only slightly (e.g. names, or simple transformations of column values), I would go for a SQL script. This is probably more compact and more straightforward to use and communicate.

  2. For schemas with major differences, with data organized in different tables or complex logic to map some values from one schema to the other, a dedicated tool may make sense. Chances are that the initial effort to write the tool is greater, but it can be an asset once created.
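As an illustration of the first case, a single `INSERT ... SELECT` can cover renamed columns and simple value transformations. This is only a sketch: it uses SQLite (driven from Python so it can be run as-is) and its `ATTACH DATABASE` to see both databases from one connection, and the tables, columns, and status codes are invented for the example.

```python
import sqlite3

# The "script" itself: renames, one concatenation, one code translation.
MIGRATION_SQL = """
INSERT INTO target.customer (customer_id, full_name, is_active)
SELECT id,
       first_name || ' ' || last_name,
       CASE status WHEN 'A' THEN 1 ELSE 0 END
FROM   main.person;
"""


def run_script():
    conn = sqlite3.connect(":memory:")
    # Second in-memory database, visible under the schema name "target".
    conn.execute("ATTACH DATABASE ':memory:' AS target")
    conn.executescript("""
        CREATE TABLE main.person (
            id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, status TEXT);
        CREATE TABLE target.customer (
            customer_id INTEGER PRIMARY KEY, full_name TEXT, is_active INTEGER);
        INSERT INTO main.person VALUES
            (1, 'Ada', 'Lovelace', 'A'),
            (2, 'Alan', 'Turing', 'X');
    """)
    conn.executescript(MIGRATION_SQL)
    conn.commit()
    return conn.execute(
        "SELECT customer_id, full_name, is_active "
        "FROM target.customer ORDER BY customer_id"
    ).fetchall()
```

Other database engines have their own ways of reaching a second database from one session, so the cross-database part would look different there; the `INSERT ... SELECT` shape stays the same.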

You should also consider non-functional aspects, such as exception handling, error logging, splitting the work into smaller transactions (because there is too much data), etc.

  1. A SQL script can indeed become "messy" under such conditions. If you have such constraints, SQL will require advanced skills and will tend to be hard to use and maintain.

  2. The custom tool can evolve into a mini-ETL with the ability to chunk the work into small transactions, manage and log errors nicely, etc. This is more work, and can end up being a dedicated project.
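A rough sketch of what such a mini-ETL loop could look like: rows are read in chunks, each chunk is written in its own transaction, and rows that fail are logged and skipped. It is written in Python with `sqlite3` only to stay self-contained; `copy_in_chunks`, `to_target`, and the `src_item` / `dst_item` tables are all hypothetical names.

```python
import logging
import sqlite3

log = logging.getLogger("sync")


def copy_in_chunks(source, target, transform, chunk_size=2):
    """Copy rows in small transactions; log and skip rows that fail."""
    copied, failed = 0, 0
    cursor = source.execute("SELECT id, name FROM src_item ORDER BY id")
    while True:
        rows = cursor.fetchmany(chunk_size)
        if not rows:
            break
        with target:  # one transaction per chunk, committed on exit
            for row in rows:
                try:
                    target.execute(
                        "INSERT INTO dst_item (item_id, label) VALUES (?, ?)",
                        transform(row),
                    )
                    copied += 1
                except (sqlite3.Error, ValueError) as exc:
                    # The failed row is skipped; the rest of the chunk continues.
                    log.error("row %r not copied: %s", row[0], exc)
                    failed += 1
    return copied, failed


def to_target(row):
    """Hypothetical mapping rule: reject empty names, normalize the rest."""
    item_id, name = row
    if not name:
        raise ValueError("empty name")
    return item_id, name.strip().upper()


def demo():
    source = sqlite3.connect(":memory:")
    target = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE src_item (id INTEGER PRIMARY KEY, name TEXT)")
    source.executemany(
        "INSERT INTO src_item VALUES (?, ?)",
        [(1, " a "), (2, ""), (3, "c"), (4, "d"), (5, "e")],
    )
    source.commit()
    target.execute(
        "CREATE TABLE dst_item (item_id INTEGER PRIMARY KEY, label TEXT NOT NULL)"
    )
    counts = copy_in_chunks(source, target, to_target, chunk_size=2)
    labels = [r[0] for r in target.execute("SELECT label FROM dst_item ORDER BY item_id")]
    return counts, labels
```

Whether a bad row should be skipped, as here, or should roll back its whole chunk is a design decision that depends on how the two databases are expected to stay consistent.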

The decision is yours.

ewernli
About 90 tables each side.
Zenithar
But how many rows?
ewernli
Thanks for this answer. In my case the data is too different on the two sides, so I am developing a migration tool first. The schema differences are the reason I would like an abstraction over the physical data. As for the data volume, it can be estimated, but it depends. I am currently studying an ETL solution using Talend; maybe it could help. Again, thanks for your answer. Regards.
Zenithar
+1  A: 

I have done that before, and I thought it was a pretty solid and straightforward way to map between 2 DBs. The only downside is that any time either database changes, I had to update the mapping logic, but it's usually pretty simple to do.

Kaleb Brasee
Yes, I realize that too, but I think it's easier to update an hbm file (through Hibernate Synchronizer) than to update dark SQL scripts.
Zenithar
"Dark sql scripts"? It's declarative programming, not black magic! :)
Matthew Wood
I know, but from my point of view it's not THE solution. "Dark" was not a good choice; maybe "messy" is better ^^. I think it's a good way to synchronize semantic objects rather than their physical representation, which is why I use Hibernate to get objects. But I need advice to make the decision, which is why I am asking THE question here (whose answer is not 42, of course).
Zenithar
A: 

Nice reference above to Hitchhiker's Guide.

My two cents. You need to consider using the right tool for the job. While it is tempting to write custom code to solve this problem, there are numerous tools out there that already do this for you: they map source to target, apply custom transformations from attribute to attribute, and will more than likely give you a faster time to market.

Look to ETL tools. I'm unfamiliar with the tools available in the open source community, but if you lean in that direction, I'm sure you'll find some. Other tools you might look at are Informatica, Data Integrator, and SQL Server Integration Services; if you're dealing with spatial data, there's another called Alteryx.

Tim

Tim Ellison