views:

106

answers:

4

I'm looking to implement an object that is journaled, or has a persistent transaction in it. That is, the object contains data (a Map perhaps). As changes are made to the data, those changes are held separately, sandboxed if you will, so that any outside object can either reference the base state (before the changes) or can access the latest data. Then there is another operation which commits the changes into the base state.

It reminds me somewhat of the Linux journaling file system. File system changes are written to a journal, and are only later committed into permanent storage.

It is also perhaps more similar to the concept of a "transaction" in the relational database world; that is, you have some data, you begin a transaction and manipulate the data in some way. Concurrent processes will see the old data with none of your changes. Then you can either "rollback" the transaction, or "commit" your changes.

I'm specifically looking to implement this in Java, but obviously it is a general object-oriented pattern, if it even exists. I'm hoping that it can at least be created but I'm not terribly sure of the best way to implement it.

Also, assume the object holds a whole ton of data, a whole hierarchy (sub objects and so on). So one cannot just keep two copies of the whole data tree; it would be very wasteful of memory and the copy operation (on commit) would take much too long. I'm looking to implement this in the context of a game, with one commit operation per frame, so it really needs to be optimal.

+1  A: 

What you are looking for is the Command Pattern. You create Command objects that do and undo the changes and just store some base state and then replay whatever commands needed to get the object into the state you want.

fuzzy lollipop
Not quite. I need the base state and the latest states to be randomly accessible; I will have two threads, one which works of of the base state and the other which works off the latest state. With the command pattern I would basically be flipping the object back and forth (undoing and redoing) and it would be VERY inefficient, as you can surely imagine.
Ricket
then you replay from the base state to whatever state you "randomly" need, once you replay to a specific point, you "rebase" or "snapshot" whatever state you need to run from and use that as your base and replay from there. This is the "standard" pattern for doing what you are looking for. If you haven't tried to implement it you __don't__ know how it will preform. This is how all journal ling systems work, there is no magic algorithm for doing this. The command are basically deltas from the previous state.
fuzzy lollipop
+3  A: 

The best way to achieve what you want is to make the object and all of its subobjects immutable. Then the two threads can operate on them without any conflicts, and you don't have to maintain two copies of everything. The only things that would require two copies are the things that actually change, and those can be very small.

Suppose object A is composed of objects B and C. Object B is composed of objects D and E. And object C is composed of objects F and G. So A, B, and C are each just two pointers, and D, E, F, and G are whatever they are.

First you create your initial instance and give it to both threads.

ThreadOne -> A1{ B1{ D1, E1 } C1{ F1, G1 } } 
ThreadTwo -> A1{ B1{ D1, E1 } C1{ F1, G1 } } 

So both threads are pointing at the same instance, no additional memory is consumed, and there are no threading issues, because the objects do not ever change.

Now ThreadOne needs to modify object F. To do this, it just creates a new F, a new C to contain it, and a new A to contain the new C. The original B, D, E, and G are unchanged, and do not need to be copied.

ThreadOne -> A2{ B1{ D1, E1 } C2{ F2, G1 } } 
ThreadTwo -> A1{ B1{ D1, E1 } C1{ F1, G1 } } 

The two threads are sharing the instances of B, D, E, and G.

Now ThreadOne needs to modify object E.

ThreadOne -> A3{ B2{ D1, E2 } C2{ F2, G1 } } 
ThreadTwo -> A1{ B1{ D1, E1 } C1{ F1, G1 } } 

Now ThreadTwo needs the latest version, so ThreadOne just gives it a pointer to its copy.

ThreadOne -> A3{ B2{ D1, E2 } C2{ F2, G1 } } 
ThreadTwo -> A3{ B2{ D1, E2 } C2{ F2, G1 } } 

Since the objects are immutable, there is no danger of any threading issues, and ThreadOne can go right on making changes, each time creating a new instance of only the parts that have changed, and their containers.

ThreadOne -> A4{ B3{ D2, E2 } C2{ F2, G1 } } 
ThreadTwo -> A3{ B2{ D1, E2 } C2{ F2, G1 } } 

ThreadOne -> A5{ B3{ D2, E2 } C3{ F3, G1 } } 
ThreadTwo -> A3{ B2{ D1, E2 } C2{ F2, G1 } } 

This is fast, memory efficient, and thread safe.

Jeffrey L Whitledge
Wow, that is actually really cool and I've never really looked into immutable classes until now. They have definite positives and I think this may indeed be the best solution to this problem! The only downside I see is the tediousness of recreating parent objects each time a child object changes, but for the other benefits I think I can cope with it.
Ricket
A: 

What you're describing sounds very much like the Memento pattern. However, you specified that you don't want to store the object's entire state, so that pattern may not work for you "out-of-the-box".

However, you might be able to modify it so that the memento included only the deltas, not the full object state. That would result in a smaller memory footprint. However, it would also mean that you could only roll-back the states in-order, since their deltas would build upon each other.

mikemanne
A: 

You can couple the Command and Memento patterns. You can have concrete commands that implement the commit and rollback operations and mementos that will save the state of the modified object before the commit.

Daniel Voina