Hi,

I'm trying to learn more about database transactions, I found the ACID rule of thumb for writing transactions and thought of a few questions.

The ACID rule of thumb:

A transaction must be:

  1. Atomic - it is one unit of work and does not depend on previous or following transactions.
  2. Consistent - data is either committed or rolled back; there is no “in-between” case where something has been updated and something hasn’t.
  3. Isolated - no transaction sees the intermediate results of the current transaction.
  4. Durable - committed values persist even if the system crashes right afterwards.

I was wondering how these work under the hood, so I can better understand the factors to consider when writing such a transaction. I guess the specific details will vary between the database implementations that are available, but certain rules will always be in place.

  1. How does the database handle concurrent transactions whilst still supporting the Atomic rule?
    • Is there a queue of transactions that is processed in order?
    • How is a lengthy transaction that is holding up all others handled?
  2. Are updates to tables done in memory so if a crash does occur before commit, there is no alteration to the database?
    • Or are there some intermediate tables that are updated to survive such a crash?
  3. Whilst a transaction is in progress, is all read and write access to the affected tables prevented?
    • Or would the database allow writes but the transaction would overwrite all changes upon commit?

Thanks

+4  A: 

The actual details would probably depend somewhat on which DB server it is, but this article might be of interest to you: Transaction Processing Cheat Sheet

ho1
+1  A: 
  1. There are many different ways, including transaction queueing, optimistic concurrency control etc. This is actually a very complex question, there are books written about it:

    http://www.amazon.co.uk/Databases-Transaction-Processing-Application-Oriented-Approach/dp/0201708728/ref=sr_1_3?ie=UTF8&s=books&qid=1281609705&sr=8-3

  2. It depends on the level of logging in the database. If strict write-ahead logs are kept, then in the case of a system crash the database can be wound back to a consistent state.
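The write-ahead idea can be sketched in a toy form: every change is appended to a log before the in-memory state is touched, so after a crash the state can be rebuilt by replaying the log. This is only an illustration with made-up names; a real engine fsyncs the log to disk and records far more (undo information, transaction boundaries, checksums):

```python
import json

class ToyWAL:
    """Toy write-ahead log: a record is appended (in reality, flushed to
    disk) before the in-memory table is modified, so the table can be
    reconstructed from the log alone after a crash."""

    def __init__(self):
        self.log = []      # stands in for an append-only file on disk
        self.table = {}    # in-memory state

    def put(self, key, value):
        # log first, then apply: the defining order of write-ahead logging
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        self.table[key] = value

    def recover(self):
        """Rebuild the table by replaying the log, as after a crash."""
        rebuilt = {}
        for record in self.log:
            r = json.loads(record)
            if r["op"] == "put":
                rebuilt[r["key"]] = r["value"]
        return rebuilt

wal = ToyWAL()
wal.put("a", 1)
wal.put("a", 2)
wal.table.clear()        # simulate losing the in-memory state in a crash
print(wal.recover())     # {'a': 2}
```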

  3. It depends on the type of concurrency. Optimistic concurrency involves no locks; instead, if the state of the db has changed by the time the transaction tries to commit, the transaction is abandoned and restarted. This can speed up dbs where collisions are rare. There are also different levels of locking: row, table, or even the entire db.
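A minimal sketch of that optimistic check, using an assumed per-row version counter (this is the general technique, not any particular database's API):

```python
class VersionedRow:
    """Optimistic concurrency via a version counter: a writer records the
    version it read, and at commit time the write succeeds only if the
    version is unchanged; otherwise the caller must retry."""

    def __init__(self, value):
        self.value = value
        self.version = 0

    def read(self):
        return self.value, self.version

    def try_commit(self, new_value, read_version):
        if self.version != read_version:
            return False          # someone else committed first: retry
        self.value = new_value
        self.version += 1
        return True

row = VersionedRow(100)
v1, ver1 = row.read()
v2, ver2 = row.read()             # a concurrent reader sees the same state
print(row.try_commit(v1 + 10, ver1))   # True: first writer wins
print(row.try_commit(v2 + 5, ver2))    # False: stale version, must restart
```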

These are very complex questions, I'd advise buying a book, or attending a concurrent systems lecture series if you want to be able to fully answer them :-)

fredley
+2  A: 

A few nitpickings on your definitions:

Atomic - it is one unit of work and does not depend on previous or following transactions.

A more correct definition of atomicity would not mention any "previous or following" transactions. Atomicity is a property of a single transaction taken by itself, namely that in the final outcome, either all of its actions persist, or none at all. In other words, it shall not be the case that "only half a transaction" is allowed to persist.

The concept is, however, blurred by concepts such as nested transactions, savepoints, and the ability for the user to request explicit rollbacks back to a taken savepoint. These do allow, in a certain sense, that "only half the actions of a transaction" persist, albeit at the user's explicit request.
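For example, SQLite exposes savepoints directly; shown here through Python's built-in sqlite3 module (the table and savepoint names are arbitrary), part of a transaction is undone while the rest commits:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.isolation_level = None              # autocommit: we issue BEGIN/COMMIT ourselves
conn.execute("CREATE TABLE t (x INTEGER)")
conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT sp1")            # mark a point inside the transaction
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO sp1")          # undo only the work done after sp1
conn.execute("COMMIT")                   # the first insert still persists
print(conn.execute("SELECT x FROM t").fetchall())   # [(1,)]
```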

Consistent - data is either committed or rolled back; there is no “in-between” case where something has been updated and something hasn’t.

This interpretation is totally wrong. Consistent means that the transaction processor (in this case, a DBMS engine) cannot leave the system (the database) in a state of violation of any declared constraint that it (the transaction processor) is aware of. See, for example, "Introduction to database systems", Chpt 16.

Isolated - no transaction sees the intermediate results of the current transaction.

Nitpicking: no transaction other than the current one is allowed to see its intermediate states (states, not really results). Note furthermore that the "isolation levels" of transaction processing engines typically define the degree to which the I property can be violated!
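The "no other transaction sees intermediate states" property can be observed with two connections to the same SQLite file (via Python's built-in sqlite3 module; this assumes SQLite's default rollback-journal mode, where readers see the last committed state):

```python
import os
import sqlite3
import tempfile

# two separate connections to the same on-disk database
path = os.path.join(tempfile.mkdtemp(), "isolation_demo.db")
writer = sqlite3.connect(path)
writer.isolation_level = None            # autocommit: BEGIN/COMMIT by hand
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE t (x INTEGER)")
writer.execute("BEGIN")
writer.execute("INSERT INTO t VALUES (1)")   # an intermediate, uncommitted state

# the reader still sees the last committed state: zero rows
before = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
writer.execute("COMMIT")
# only after the commit does the row become visible to others
after = reader.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(before, after)
```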

Durable - committed values persist even if the system crashes right afterwards.

But this property too is blurred a bit by the possibility of nested transactions. Even if an inner transaction has committed and completed, the containing transaction can still undo that commit by itself rolling back completely.

Erwin Smout
Thanks for the response. What is the verdict on nested transactions - wouldn't they contravene the Atomic and Isolated rules?
fletcher
A: 

"What is the verdict on nested transactions"

There is no verdict. Nested transactions exist. ACID properties exist. They're forced to co-exist. There are no absolutes. Certainly not to ACID.

Erwin Smout
+1  A: 

Consistent - data is either committed or rolled back; there is no “in-between” case where something has been updated and something hasn’t.

I disagree with Erwin Smout's view of what Consistent means - your interpretation is closer to the mark. By my reading of Ramakrishnan and Gehrke, a consistent state goes beyond the declared constraints of the system.

In the case of transferring money between two accounts by debiting one account and crediting another, the system could be in several states:

  1. Both accounts hold their initial balances;
  2. Amount is deducted from one account balance but not added to the other;
  3. Amount is added to one account balance but not deducted from the other;
  4. Both accounts hold their final balances.

In all four states the integrity constraints of the system can hold. But the second and third do not match a reasonable view of the system - the money should be somewhere. So these are not consistent states, while the initial and final states are consistent.

Transactions don't automatically make a system consistent - they enable a user to write operations that keep it so. A badly written transaction could have a bug that forgets to credit the second account. The transaction would run fine and the integrity constraints would hold.

But a correctly written procedure takes the system from a consistent state, makes some changes that are temporarily inconsistent (e.g., money not in either account), and then brings the system back to a consistent state. By placing these steps in a transaction you are guaranteed that the system either reaches the final consistent state (when it commits) or returns to its initial consistent state (if it rolls back). Either way, consistency is retained.
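As a sketch of that guarantee, the transfer can be wrapped in a transaction using Python's built-in sqlite3 module (the account names and the simulated failure are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

def transfer(conn, src, dst, amount):
    try:
        # the with-block commits on success and rolls back on an exception
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src))
            if amount > 60:
                # simulated failure between the debit and the credit
                raise RuntimeError("failure mid-transfer")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst))
    except RuntimeError:
        pass   # the debit was rolled back; no money vanished

transfer(conn, "alice", "bob", 70)   # fails mid-way: rolled back entirely
transfer(conn, "alice", "bob", 30)   # succeeds: committed as one unit
print(conn.execute(
    "SELECT balance FROM accounts ORDER BY name").fetchall())   # [(70,), (80,)]
```

Either both balances change or neither does, which is exactly the "initial or final consistent state" guarantee described above.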

beldaz