The core concerns as I saw them were (a) avoiding deadlocks and (b) exchanging data between threads. A lessor concern (but only slightly lessor) was avoiding bottlenecks. I had already encountered several problems with disparate out of sequence locking causing deadlocks - it's very well to say "always acquire locks in the same order", but in a medium to large system it is practically speaking often impossible to ensure this.
Caveat: When I came up with this solution I had to target Java 1.1 (so the concurrency package was not yet a twinkle in Doug Lea's eye) - the tools at hand were entirely synchronized and wait/notify. I drew on experience writing a complex multi-process communications system using the real-time message based system QNX.
Based on my experience with QNX which had the deadlock concern, but avoided data-concurrency by coping messages from one process's memory space to anothers, I came up with a message-based approach for objects - which I called IOC, for inter-object coordination. At the inception I envisaged I might create all my objects like this, but in hindsight it turns out that they are only necessary at the major control points in a large application - the "interstate interchanges", if you will, not appropriate for every single "intersection" in the road system. That turns out to be a major benefit because they are quite un-POJO.
I envisaged a system where objects would not conceptually invoke synchronized methods, but instead would "send messages". Messages could be send/reply, where the sender waits while the message is processed and returns with the reply, or asynchronous where the message is dropped on a queue and dequeued and processed at a later stage. Note that this is a conceptual distinction - the messaging was implemented using synchronized method calls.
The core objects for the messaging system are an IsolatedObject, an IocBinding and an IocTarget.
The IsolatedObject is so called because it has no public methods; it is this that is extended in order to receive and process messages. Using reflection it is further enforced that child object has no public methods, nor any package or protected methods except those inherited from IsolatedObject nearly all of which are final; it looks very strange at first because when you subclass IsolatedObject, you create an object with 1 protected method:
Object processIocMessage(Object msgsdr, int msgidn, Object msgdta)
and all the rest of the methods are private methods to handle specific messages.
The IocTarget is a means of abstracting visibility of an IsolatedObject and is very useful for giving another object a self-reference for sending signals back to you, without exposing your actual object reference.
And the IocBinding simply binds a sender object to a message receiver so that validation checks are not incurred for every message sent, and is created using an IocTarget.
All interaction with the isolated objects is through "sending" it messages - the receiver's processIocMessage method is synchronized which ensures that only one message is be handled at a time.
Object iocMessage(int mid, Object dta)
void iocSignal (int mid, Object dta)
Having created a situation where all work done by the isolated object is funneled through a single method, I next arranged the objects in a declared hierarchy by means of a "classification" they declare when constructed - simply a string that identifies them as being one of any number of "types of message receiver", which places the object within some predetermined hierarchy. Then I used the message delivery code to ensure that if the sender was itself an IsolatedObject that for synchronous send/reply messages it was one which is lower on the hierarchy. Asynchronous messages (signals) are dispatched to message receivers using separate threads in a thread pool who's entire job deliver signals, therefore signals can be send from any object to any receiver in the system. Signals can can deliver any message data desired, but not reply is possible.
Because messages can only be delivered in an upward direction (and signals are always upward because they are delivered by an separate thread running solely for that purpose) deadlocks are eliminated by design.
Because interactions between threads are accomplished by exchanging messages using Java synchronization, all race conditions and issues of stale data are likewise eliminated by design.
Because any given receiver handles only one message at a time, and because it has no other entry points, all considerations of object state are eliminated - effectively, the object is fully synchronized and synchronization cannot accidentally be left of any method; no getters returning stale cached thread data and no setters changing object state while another method is acting on it.
Because only the interactions between major components is funneled through this mechanism, in practice this has scaled very well - those interactions don't happen nearly as often in practice as I theorized.
The entire design becomes one of an orderly collection of subsystems interacting in a tightly controlled manner.
Note this is not used for simpler situations where worker threads using more conventional thread pools will suffice (though I will often inject the worker's results back into the main system by sending an IOC message). Nor is it used for situations where a thread goes off and does something completely independent of the rest of the system such as an HTTP server thread. Lastly, it is not used for situations where there is a resource coordinator that itself does not interact with other objects and where internal synchronization will do the job without risk of deadlock.
EDIT: I should have stated that the messages exchanged should generally be immutable objects; if using mutable objects the act of sending it should be considered a hand over and cause the sender to relinquish all control, and preferably retain no references to the data. Personally, I use a lockable data structure which is locked by the IOC code and therefore becomes immutable on sending (the lock flag is volatile).