If my understanding is correct, the problem is the following:
You want to create a distributed, scalable system, and Erlang is the first choice that comes to mind, since it was designed for such purposes.
You will have several nodes running both local and distributed applications.
Here the simplest hierarchy is to have a hot-standby backup for every major piece of functionality.
This can be achieved with a distributed application controller.
The simplest example is to start a server on one node while a slave server is started simultaneously on a mate node.
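As a rough sketch of what that could look like with OTP's built-in distributed application support, the kernel configuration below declares one application with a primary node and a standby node. The application name my_server_app and the node names primary@host1 and standby@host2 are made up for illustration, and the timeouts are arbitrary; only the kernel parameter names (distributed, sync_nodes_mandatory, sync_nodes_timeout) come from OTP.

```erlang
%% sys.config, assumed to be used (with the same contents) on both nodes.
[{kernel,
  [%% my_server_app (hypothetical) prefers primary@host1 and falls back to
   %% standby@host2. 5000 is the time in ms to wait for the failed node
   %% to come back before the application is started elsewhere.
   {distributed, [{my_server_app, 5000,
                   ['primary@host1', 'standby@host2']}]},
   %% Wait for the mate node at boot before starting applications.
   {sync_nodes_mandatory, ['primary@host1', 'standby@host2']},
   {sync_nodes_timeout, 30000}]}].
```

Note that with this mechanism the application is loaded on both nodes but runs on only one at a time; if you need the standby server actually running in parallel, you would start it as a local application on the mate node instead.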
Distributed application controllers have many advantages.
- An easy example is handling nodeup messages differently: introduce a new message indicating that a node is not only up at the Erlang VM level, but that all of its vital applications are running. This way the mate node can be sure that the standby node is ready and can start syncing.
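A minimal sketch of that idea, assuming a small helper process of your own: it listens for nodeup, asks the new node which applications are running, and only then forwards a custom {node_ready, Node} message. The module name ready_monitor, the message names node_ready and recheck, and the one-second retry interval are assumptions for illustration; net_kernel:monitor_nodes/1, rpc:call/4 and application:which_applications/0 are standard OTP calls.

```erlang
-module(ready_monitor).
-export([start/2, init/2, all_running/2]).

%% VitalApps: application names that must be running on the peer.
%% Subscriber: pid that receives {node_ready, Node} (hypothetical message).
start(VitalApps, Subscriber) ->
    spawn(?MODULE, init, [VitalApps, Subscriber]).

init(VitalApps, Subscriber) ->
    %% Subscribe to {nodeup, Node} and {nodedown, Node} messages.
    %% Assumes this node was started as a distributed node (-sname/-name).
    net_kernel:monitor_nodes(true),
    loop(VitalApps, Subscriber).

loop(VitalApps, Subscriber) ->
    receive
        {nodeup, Node} ->
            check(Node, VitalApps, Subscriber);
        {recheck, Node} ->
            check(Node, VitalApps, Subscriber);
        {nodedown, _Node} ->
            ok
    end,
    loop(VitalApps, Subscriber).

%% Ask the peer which applications are running and react accordingly.
check(Node, VitalApps, Subscriber) ->
    case rpc:call(Node, application, which_applications, []) of
        {badrpc, _Reason} ->
            %% Peer unreachable: give up until the next nodeup arrives.
            ok;
        Running ->
            case all_running(VitalApps, Running) of
                true ->
                    Subscriber ! {node_ready, Node};
                false ->
                    %% VM is up but applications are not: retry in 1 s.
                    erlang:send_after(1000, self(), {recheck, Node})
            end
    end.

%% True if every vital application appears in the which_applications list.
all_running(VitalApps, Running) ->
    Names = [Name || {Name, _Desc, _Vsn} <- Running],
    lists:all(fun(App) -> lists:member(App, Names) end, VitalApps).
```

The mate node's sync process would then call something like ready_monitor:start([my_server_app], self()) and begin syncing when it receives {node_ready, Node}. Polling is only one option; the standby could just as well push the ready message itself once its applications have started.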
Please elaborate or comment if I misunderstood something.
Good luck!