I'm very familiar with using a transactional RDBMS, but how would I make sure that changes made to my in-memory data are rolled back if the transaction fails? What if I'm not even using a database?

Here's a contrived example:

public void TransactionalMethod()
{
    var items = GetListOfItems();

    foreach (var item in items)
    {
        MethodThatMayThrowException(item);

        item.Processed = true;
    }
}

In my example, I might want the changes made to the items in the list to somehow be rolled back, but how can I accomplish this?

I am aware of "software transactional memory" but don't know much about it and it seems fairly experimental. I'm aware of the concept of "compensatable transactions", too, but that incurs the overhead of writing do/undo code.

Subversion seems to deal with errors updating a working copy by making you run the "cleanup" command.

Any ideas?

UPDATE:
Reed Copsey offers an excellent answer, including:

Work on a copy of data, update original on commit.

This takes my question one level further - what if an error occurs during the commit? We so often think of the commit as an immediate operation, but in reality it may be making many changes to a lot of data. What happens if there are unavoidable things like OutOfMemoryExceptions while the commit is being applied?

On the flip side, if one goes for a rollback option, what happens if there's an exception during the rollback? I understand that systems like Oracle RDBMS have rollback segments and UNDO logs, but those rely on serialisation to disk (if it isn't serialised to disk it didn't happen, and after a crash you can inspect those logs and recover). Assuming there's no serialisation to disk, is this really possible?

UPDATE 2:
An answer from Alex made a good suggestion: namely, that one updates a different object, and the commit phase then simply switches the reference from the current object over to the new object. He went further to suggest that the object you change is effectively a list of the modified objects.

I understand what he's saying (I think), and I want to make the question more complex as a result:

How, given this scenario, do you deal with locking? Imagine you have a list of customers:

var customers = new Dictionary<CustomerKey, Customer>();

Now, you want to make a change to some of those customers, how do you apply those changes without locking and replacing the entire list? For example:

var customerTx = new Dictionary<CustomerKey, Customer>();

foreach (var customer in customers.Values)
{
    var updatedCust = customer.Clone();
    customerTx.Add(GetKey(updatedCust), updatedCust);

    if (CalculateRevenueMightThrowException(customer) >= 10000)
    {
        updatedCust.Preferred = true;
    }
}

How do I commit? This (Alex's suggestion) will mean locking all customers while replacing the list reference:

lock (customers)
{
    customers = customerTx;
}

Whereas if I loop through, modifying the references in the original list, it's not atomic, and falls foul of the "what if it crashes partway through" problem:

foreach (var kvp in customerTx)
{
    customers[kvp.Key] = kvp.Value;
}
+6  A: 

Pretty much every option for doing this requires one of three basic methods:

  1. Make a copy of your data before modifications, to revert to a rollback state if aborted.
  2. Work on a copy of data, update original on commit.
  3. Keep a log of changes to your data, to undo them in the case of an abort.

For example, Software Transactional Memory, which you mentioned, follows the third approach. The nice thing about that is that it can work on the data optimistically, and just throw away the log on a successful commit.
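As a rough illustration of the third method (a change log), here is a minimal sketch rather than a definitive implementation. The `Item` and `UndoLog` types and the `TryProcessAll` helper are hypothetical names made up for this example; `Item` stands in for whatever `GetListOfItems()` returns in the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type, standing in for the question's list elements.
public class Item
{
    public string Name { get; set; } = "";
    public bool Processed { get; set; }
}

// A minimal undo log: every change registers an action that reverses it.
public sealed class UndoLog
{
    private readonly Stack<Action> _undoActions = new Stack<Action>();

    public void Record(Action undo) => _undoActions.Push(undo);

    // Reverse the recorded changes, most recent first.
    public void Rollback()
    {
        while (_undoActions.Count > 0)
        {
            _undoActions.Pop()();
        }
    }

    // On success the log is simply thrown away.
    public void Commit() => _undoActions.Clear();
}

public static class UndoLogDemo
{
    // Marks every item as processed, or restores all of them if processItem throws.
    public static bool TryProcessAll(IList<Item> items, Action<Item> processItem)
    {
        var log = new UndoLog();
        try
        {
            foreach (var item in items)
            {
                processItem(item);

                bool previous = item.Processed;   // capture the old value first
                log.Record(() => item.Processed = previous);
                item.Processed = true;
            }
            log.Commit();
            return true;
        }
        catch (Exception)
        {
            log.Rollback();
            return false;
        }
    }
}
```

Note that this only undoes the changes that were recorded. Any side effects of `processItem` itself would need their own undo entries.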

Reed Copsey
Yes, the crux of my question is what happens in those commit/rollback phases, though. What if there is an exception *during the commit*? Then you have a part-committed change. You can force your data into lists of objects, but then you run into the coarse locking scenario I updated my question with. I'm talking about really zooming right into the actual commit/rollback operation itself, to see how it copes if **it** is the thing that fails.
Neil Barnwell
There are lots of options here. I'd look at database theory and practice for common commit/rollback schemes, but they often include working on a full copy, and just switching a reference at the end, or locking whole objects (such as your customers lock), etc. Again, it's a huge field - hard to describe in detail, without a specific question.
Reed Copsey
+1  A: 

Take a look at the Microsoft Research project, SXM.

From Maurice Herlihy's page, you can download documentation as well as code samples.

Magnus Johansson
A: 
public void TransactionalMethod()
{
    var items = GetListOfItems();

    try {
        foreach (var item in items)
        {           
            MethodThatMayThrowException(item);

            item.Processed = true;
        }
    }
    catch(Exception ex) {
        foreach (var item in items)
        {
            if (item.Processed) {
                UndoProcessingForThisItem(item);
            }
        }
    }
}

Obviously, the implementation of the "Undo..." is left as an exercise for the reader.
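One possible way to fill in the undo step, sketched under the assumption that the only state changed is the `Processed` flag, is to save the old values before starting and copy them back on failure. `WorkItem` and `SnapshotRollback` are hypothetical names used only for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type with the same Processed flag as in the question.
public class WorkItem
{
    public string Name { get; set; } = "";
    public bool Processed { get; set; }
}

public static class SnapshotRollback
{
    // Processes all items, or restores every Processed flag if process throws.
    public static bool TryProcessAll(IList<WorkItem> items, Action<WorkItem> process)
    {
        // Save the state to restore before touching anything.
        var saved = new List<bool>(items.Count);
        foreach (var item in items)
        {
            saved.Add(item.Processed);
        }

        try
        {
            foreach (var item in items)
            {
                process(item);
                item.Processed = true;
            }
            return true;
        }
        catch (Exception)
        {
            // The restore loop only assigns saved values; it allocates nothing.
            for (int i = 0; i < items.Count; i++)
            {
                items[i].Processed = saved[i];
            }
            return false;
        }
    }
}
```

Because the saved values are captured up front, the rollback itself does very little work, which keeps the chance of a failure during rollback small, though it cannot rule one out.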

Will Hartung
Yes, this makes a lot of sense. I've thought about implementing something like it, but the purist in me asks "what happens if there's an exception during the rollback?". How would you cope with that scenario?
Neil Barnwell
@Neil - Well, that's the game, isn't it? Since you're working in memory, you will most likely only encounter an application exception, rather than, say, an I/O exception. You could encounter a memory exception, but, frankly, when that occurs you have REAL problems. Even though YOU may be able to cope with an OOM exception, most libraries can't. But, anyway, obviously this is the aspect that makes all of this challenging. Arguably, since you've "already done it", "undoing it" should be safe, so the risk is low, but only your code will know for sure. I have 23 chars left.
Will Hartung
+1  A: 

You asked:

"What if an error occurs during the commit?"

It doesn't matter. You can commit to somewhere/something in memory and check meanwhile if the operation succeeds. If it did, you change the reference of the intended object (object A) to where you committed (object B). Then you have failsafe commits - the reference is only updated on successful commit. Reference change is atomic.
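A minimal sketch of this idea, assuming the customers dictionary from the question is held behind a single root reference. The `Customer` and `CustomerStore` types and the `TryUpdateAll` method are hypothetical names for illustration; `Volatile.Read` and `Interlocked.CompareExchange` are the standard .NET calls.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical customer type; Clone makes a shallow copy.
public class Customer
{
    public string Name { get; set; } = "";
    public decimal Revenue { get; set; }
    public bool Preferred { get; set; }

    public Customer Clone() => (Customer)MemberwiseClone();
}

public sealed class CustomerStore
{
    // The single root reference. Readers always see one complete snapshot.
    private Dictionary<string, Customer> _current;

    public CustomerStore(IEnumerable<Customer> customers)
    {
        _current = new Dictionary<string, Customer>();
        foreach (var customer in customers)
        {
            _current.Add(customer.Name, customer);
        }
    }

    // Callers should treat the returned snapshot and its customers as read-only.
    public IReadOnlyDictionary<string, Customer> Snapshot => Volatile.Read(ref _current);

    // Applies change to a copy of every customer, then publishes the copies
    // with one reference swap. Returns false if another writer committed first.
    public bool TryUpdateAll(Action<Customer> change)
    {
        var original = Volatile.Read(ref _current);
        var next = new Dictionary<string, Customer>(original.Count);

        foreach (var kvp in original)
        {
            var copy = kvp.Value.Clone();
            change(copy);                 // may throw; nothing is published yet
            next.Add(kvp.Key, copy);
        }

        // Commit: swap the root only if it is still the snapshot we started from.
        var seen = Interlocked.CompareExchange(ref _current, next, original);
        return ReferenceEquals(seen, original);
    }
}
```

If `change` throws, the exception propagates and `_current` is untouched, so there is no partial commit to clean up. The cost is a full copy of the dictionary per transaction, and a caller that gets `false` back has to redo its work against the newer snapshot.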

Alex
Yes, a *single* reference change is atomic, but what if you have a list of references, and you were only halfway through updating them during the commit phase when you got an exception?
Neil Barnwell
You're wrong. There will always be only one reference to update. All other references, from Object B to Object(s) X, Y, Z, have already been established in the prior commit phase. Once you confirm that Object B is indeed what you need it to be (that is, the commit was successful), you only establish the reference to B and gain full access to whatever it references.
Alex
To clarify, Object B is the root instance, which you can always establish via a wrapper if you don't have one already. You can always bring any number of descendant relationships down to one root. That results in a single reference update, hence atomic.
Alex
Okay, so you're basically saying to update a list of objects, then replace (effectively) a List object with the List object containing the updated items? That's much clearer, thanks.
Neil Barnwell
Yes, hope that helps :)
Alex