Can anyone help me see in what kinds of scenarios it would make sense to have one shared database transaction and multiple connections? Thanks.
If you mean multiple databases all being updated within one transaction, then you would do this for atomicity - http://en.wikipedia.org/wiki/Atomicity_(database_systems)
Hypothetically, consider a bank transfer with a different database for each account provider - the money has to leave one account and arrive in the other. If it fails part way through - e.g. the second database update fails - then the money has left one account but not arrived in the other, which is not acceptable.
The transaction ensures that if any one of the updates fails, they are all cancelled (rolled back), leaving the data in the state it was in before the transaction began.
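To make the bank-transfer case concrete, here is a minimal sketch of a two-phase commit across two separate PostgreSQL databases, using psycopg2's PEP 249 two-phase-commit extensions. The connection strings, table and column names are made up for illustration, PostgreSQL needs max_prepared_transactions set above zero for the prepare step, and in practice a transaction coordinator (a DTC) would drive these steps and recover prepared-but-uncommitted branches after a crash:

```python
import psycopg2

AMOUNT, FROM_ACCT, TO_ACCT = 100, 1, 7  # hypothetical transfer details

# Two independent databases, one per account provider (placeholder DSNs)
conn_a = psycopg2.connect("dbname=bank_a")
conn_b = psycopg2.connect("dbname=bank_b")

# One global transaction id, with a distinct branch qualifier per participant
xid_a = conn_a.xid(1, "transfer-0001", "bank_a")
xid_b = conn_b.xid(1, "transfer-0001", "bank_b")

try:
    conn_a.tpc_begin(xid_a)
    conn_b.tpc_begin(xid_b)

    with conn_a.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance - %s WHERE id = %s",
                    (AMOUNT, FROM_ACCT))
    with conn_b.cursor() as cur:
        cur.execute("UPDATE accounts SET balance = balance + %s WHERE id = %s",
                    (AMOUNT, TO_ACCT))

    # Phase 1: both participants promise they can commit
    conn_a.tpc_prepare()
    conn_b.tpc_prepare()

    # Phase 2: only now are the changes made durable on both databases
    conn_a.tpc_commit()
    conn_b.tpc_commit()
except Exception:
    # Naive recovery: undo both branches. A real coordinator must also handle
    # the window where one branch has already committed and the other is still
    # only prepared - that bookkeeping is exactly what a DTC provides.
    conn_a.tpc_rollback()
    conn_b.tpc_rollback()
    raise
finally:
    conn_a.close()
    conn_b.close()
```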
Personally I don't find it sensible in an RDBMS -- however I can see it reducing design complexity for VERY high-load databases.
For example, in the e-commerce case you might have the data partitioned so that product inventories are on one database and orders on another. In that case you would want to decrement a stock count and increment an invoice count when processing an order -- there a global transaction would make sense.
But 99% of the time there is a better alternative that can be solved in the design (see the sketch below).
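As a sketch of what "solved in the design" can mean: if inventory and orders live in the same database instead of being partitioned across two, an ordinary local transaction on a single connection already gives you the atomic decrement-and-invoice with no coordinator involved. The psycopg2 usage and the table/column names below are illustrative assumptions:

```python
import psycopg2

# Hypothetical single database holding both the inventory and invoices tables
conn = psycopg2.connect("dbname=shop")

try:
    # One plain local transaction: committed on success, rolled back on any error
    with conn:
        with conn.cursor() as cur:
            cur.execute("UPDATE inventory SET stock = stock - 1 WHERE product_id = %s",
                        (42,))
            cur.execute("INSERT INTO invoices (product_id, quantity) VALUES (%s, %s)",
                        (42, 1))
finally:
    conn.close()
```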
-- edit : the pitfalls of global transactions --
These two points are why I would recommend not using global transactions:
Point 1:
Global transactions involve multiple database servers (or at least they should). A global transaction requires a DTC (distributed transaction coordinator), and employing such an agent can slow your queries by orders of magnitude, since the work is no longer done in the scope of a single machine but is coordinated across multiple machines, which means going over the network.
Point 2:
If your queries aren't designed properly (most people do not understand the subtleties), you might end up locking large portions of the tables on the individual databases; sometimes people even end up locking entire tables with a single query. If things aren't properly designed for distributed queries, your applications will come to a standstill and someone will get fired :D. You need to make sure that your queries lock only what they must, and you should try to ensure those locked portions of the data are only used by one query at a time.
Why is it worse to lock tables in a distributed query? Because of point 1: your locks now last orders of magnitude longer.
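To illustrate the locking point, here is a sketch (again with made-up psycopg2/PostgreSQL table names) contrasting an update whose predicate can only ever touch one row with one whose predicate forces the database to scan, and therefore lock, far more data. Inside a global transaction, whatever gets locked stays locked until every participant has committed:

```python
import psycopg2

conn = psycopg2.connect("dbname=shop")  # placeholder connection
with conn, conn.cursor() as cur:
    # Narrow, index-backed predicate: only the single matching row is locked.
    cur.execute("UPDATE inventory SET stock = stock - 1 WHERE product_id = %s", (42,))

    # Broad predicate (shown commented out): the database may scan and lock many
    # rows, and some engines escalate to a table lock. In a distributed
    # transaction those locks are held across the network round trips until the
    # global commit completes.
    # cur.execute("UPDATE inventory SET stock = stock - 1 "
    #             "WHERE category <> 'discontinued'")
conn.close()
```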
-- edit : potential area you might want to investigate --
Clustering technologies and HPC often make use of Distributed Lock Managers. You will learn a lot by studying the data-management variants of these technologies, as they will show you where these implementations consider it necessary to take global locks (which is what a global transaction does).
When you want a transactional operation that affects multiple databases.