In our projects, we've used ScaleOut StateServer (a commercial product, www.scaleoutsoftware.com) for distributed caching and replication of objects across a server farm, for similar purposes. It has been quite effective, although storing objects does incur (de)serialization costs, so in many cases we simplify what we store to plain string values where possible.
We haven't fully evaluated the Velocity Project, since our usage started before it existed and we don't have the time or a compelling reason to consider a switch at this point, but it obviously warrants some investigation if you're just starting now.
Edit: I did indeed miss the important part of the question: the flattening of object references. This may be over-complicating things or have other drawbacks, but what if you made your distributed cache more closely simulate database storage, keeping a single copy of each distinct object entity and using looser references to link those entities together?
Example: you have a class, 'Group', which has a 'Leader' property and a 'Members' collection, all containing instances of your 'Person' class. You'd have to use custom serialization to pull it off, and nothing will magically solve concurrency / dirty update problems, but the idea is that what you put into the distributed cache would be all the individual 'Person' instances, plus a shallow copy of the 'Group' instance itself. That shallow copy would serialize the normal 'Group' properties (name, etc.), along with a unique identifier for each 'Person' it references (the original database ID, a GUID, a unique username, or whatever is appropriate) rather than the Person objects themselves. So you'd have a 'LeaderID' instead of the Leader, and the Members collection would serialize as a list of member IDs. Each referenced Person is also stored as a separate object; that's where the concurrency trickiness comes into play.
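To make the shallow-copy idea concrete, here is a minimal sketch in Python (used only as neutral pseudocode-that-runs; the same shape applies in .NET). Everything in it is an assumption for illustration: the plain dict standing in for the distributed cache, the 'person:' / 'group:' key scheme, JSON as the serialization format, and the store_group helper are all hypothetical and not part of any product's API.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Person:
    id: str
    name: str


@dataclass
class Group:
    name: str
    leader: Person
    members: list = field(default_factory=list)


def store_group(cache, group):
    """Write each Person once, then a shallow Group record that holds only IDs."""
    for person in [group.leader, *group.members]:
        # One entry per distinct Person, keyed by its unique ID (hypothetical key scheme).
        cache[f"person:{person.id}"] = json.dumps(asdict(person))
    shallow = {
        "name": group.name,
        "leader_id": group.leader.id,                  # 'LeaderID' instead of the Leader
        "member_ids": [m.id for m in group.members],   # IDs instead of Person objects
    }
    cache[f"group:{group.name}"] = json.dumps(shallow)


# A plain dict stands in for the distributed cache client (assumption).
cache = {}
alice = Person("p1", "Alice")
bob = Person("p2", "Bob")
store_group(cache, Group("admins", alice, [alice, bob]))
store_group(cache, Group("devs", bob, [bob]))
```

Even though Bob belongs to both groups, the cache ends up with a single 'person:p2' entry, and each group record carries only his ID.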
When deserializing (or on access, depending on usage patterns), the Group shallow copy would follow all Person ID references and re-hydrate those references with the real Person objects stored separately in the distributed cache. You'd need to implement locking mechanisms to make sure updates to those objects, which could be shared among many different Groups, were safe. You'd also need a versioning mechanism and a 'dirty check' whenever necessary to re-read/pick up any changes to the Person object made within the distributed cache.
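A rough sketch of the re-hydration and dirty-check side might look like the following. It is self-contained and rests on several assumptions: VersionedCache is a hypothetical in-memory stand-in for the real cache, GroupView is a hypothetical wrapper around the shallow Group record, and instead of explicit locks it uses a per-entry version number with a compare-and-set style write (optimistic concurrency), which is one possible way to make shared updates safe, not necessarily the right one for your case.

```python
import json
import threading


class VersionedCache:
    """In-memory stand-in for a distributed cache; every entry carries a version."""

    def __init__(self):
        self._data = {}                 # key -> (version, serialized value)
        self._lock = threading.Lock()   # protects this local stand-in only

    def get(self, key):
        with self._lock:
            return self._data.get(key, (0, None))

    def version(self, key):
        # A real cache would ideally expose this without transferring the value.
        return self.get(key)[0]

    def put_if_version(self, key, expected_version, value):
        """Write only if the entry has not changed since it was read."""
        with self._lock:
            current, _ = self._data.get(key, (0, None))
            if current != expected_version:
                return False            # someone else updated it first
            self._data[key] = (current + 1, value)
            return True


class GroupView:
    """Shallow Group record that re-hydrates Person references on access."""

    def __init__(self, cache, group_key):
        self._cache = cache
        self._shallow = json.loads(cache.get(group_key)[1])
        self._people = {}               # person id -> (version, deserialized dict)

    def person(self, person_id):
        key = f"person:{person_id}"
        local = self._people.get(person_id)
        # Dirty check: re-read only when the cached version has moved on.
        if local is None or local[0] != self._cache.version(key):
            version, raw = self._cache.get(key)
            self._people[person_id] = (version, json.loads(raw))
        return self._people[person_id][1]

    @property
    def leader(self):
        return self.person(self._shallow["leader_id"])

    @property
    def members(self):
        return [self.person(pid) for pid in self._shallow["member_ids"]]


cache = VersionedCache()
cache.put_if_version("person:p1", 0, json.dumps({"id": "p1", "name": "Alice"}))
cache.put_if_version("group:admins", 0, json.dumps(
    {"name": "admins", "leader_id": "p1", "member_ids": ["p1"]}))

view = GroupView(cache, "group:admins")
first = view.leader["name"]

seen = cache.version("person:p1")
ok_write = cache.put_if_version("person:p1", seen, json.dumps({"id": "p1", "name": "Alicia"}))
stale_write = cache.put_if_version("person:p1", seen, json.dumps({"id": "p1", "name": "Al"}))
second = view.leader["name"]
```

Here the second write is rejected because it was based on an out-of-date version, and the next access through the Group notices the version change and re-reads the Person. A caller whose write is rejected would re-read, re-apply its change, and try again.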
It does seem quite complicated, but that's the most generic approach I could think of without knowing the specifics of your use case.