My application runs in an Erlang cluster, usually with two or more nodes. The nodes actively monitor each other (using erlang:monitor_node), which works fine: I can detect and react to the fact that a node that was up is now down.
But how do I then find out that the node has restarted and is back in business? I can of course periodically ping the node until it is back up, but is there a better way that I've simply missed? Are process groups a better way of achieving this?
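For reference, the monitor-then-poll approach described above looks roughly like the following. This is only a minimal sketch: the module name `node_watch`, the function names, and the 5-second retry interval are made up for illustration and are not taken from my actual application.

```erlang
-module(node_watch).
-export([start/1]).

%% Spawn a watcher process for Node (hypothetical helper).
start(Node) ->
    spawn(fun() -> watch(Node) end).

%% Ask the runtime for a {nodedown, Node} message when the connection
%% to Node is lost. If Node cannot be reached at all, the nodedown
%% message is delivered straight away.
watch(Node) ->
    erlang:monitor_node(Node, true),
    receive
        {nodedown, Node} ->
            io:format("~p is down~n", [Node]),
            wait_for_restart(Node)
    end.

%% Poll with net_adm:ping/1 until the node answers, then monitor again.
wait_for_restart(Node) ->
    case net_adm:ping(Node) of
        pong ->
            io:format("~p is back~n", [Node]),
            watch(Node);
        pang ->
            timer:sleep(5000),   % assumed retry interval, in milliseconds
            wait_for_restart(Node)
    end.
```

It would be started with something like `node_watch:start('other@host')`, where the node name is a placeholder. It works, but the polling loop is the part I would like to get rid of.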
(Edited to add)
I think the answer suggesting a technique like electing a supervisor is the line of thinking I was missing. I'll look into that and mark this question as done.