I'm planning to build a distributed database system using a shared-nothing architecture and multiversion concurrency control. Redundancy will be achieved through asynchronous replication (losing some recent changes after a failure is acceptable, as long as the data in the system remains consistent). For each database entry, exactly one node holds the master copy, and only that node may write to it. One or more other nodes hold read-only secondary copies of the entry for scalability and redundancy. When the master copy of an entry is updated, the new version is timestamped and sent asynchronously to the nodes with secondary copies, so that they eventually receive the latest version of the entry. The node that holds the master copy can change at any time: if another node needs to write an entry, it asks the current owner of the master copy to hand over ownership, and once it has received ownership it can write the entry (all transactions and writes are local).
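To make the design above concrete, here is a minimal in-memory sketch in Python. It is only an illustration under several assumptions: the names `Node`, `Version`, `create`, `take_ownership` and `flush` are hypothetical, the timestamp is modelled as a per-entry counter rather than a wall-clock time, and an `outbox` list stands in for the asynchronous network. Failure handling is deliberately left out, since that is what the questions below are about.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Version:
    value: str
    timestamp: int  # assumption: per-entry counter set by the master on each write


@dataclass
class Node:
    name: str
    masters: Dict[str, Version] = field(default_factory=dict)      # entries this node may write
    secondaries: Dict[str, Version] = field(default_factory=dict)  # read-only copies
    outbox: List[Tuple["Node", str, Version]] = field(default_factory=list)

    def create(self, key: str, value: str) -> Version:
        """Create a new entry; the creating node starts as its master."""
        version = Version(value, 1)
        self.masters[key] = version
        return version

    def write(self, key: str, value: str, replicas: List["Node"]) -> Version:
        """Local write; only allowed on the node holding the master copy."""
        if key not in self.masters:
            raise PermissionError(f"{self.name} does not own the master copy of {key!r}")
        version = Version(value, self.masters[key].timestamp + 1)
        self.masters[key] = version
        # Replication is asynchronous: queue the update instead of sending it now.
        for replica in replicas:
            self.outbox.append((replica, key, version))
        return version

    def flush(self) -> None:
        """Deliver queued updates (stands in for the network catching up)."""
        for replica, key, version in self.outbox:
            replica.receive(key, version)
        self.outbox.clear()

    def receive(self, key: str, version: Version) -> None:
        """Apply a replicated update unless it is stale."""
        if key in self.masters:
            return  # we own the entry now, so this update predates the hand-over
        current = self.secondaries.get(key)
        if current is None or version.timestamp > current.timestamp:
            self.secondaries[key] = version

    def take_ownership(self, key: str, owner: "Node") -> None:
        """Ask the current owner to hand over the master copy of an entry."""
        version = owner.masters.pop(key)
        owner.secondaries[key] = version  # the old owner keeps a read-only copy
        self.secondaries.pop(key, None)
        self.masters[key] = version

    def read(self, key: str) -> Optional[Version]:
        if key in self.masters:
            return self.masters[key]
        return self.secondaries.get(key)


a, b = Node("a"), Node("b")
a.create("x", "v1")
a.write("x", "v2", [b])   # b does not see this until a.flush()
a.flush()
b.take_ownership("x", a)  # b wants to write, so it requests ownership first
b.write("x", "v3", [a])
b.flush()
```

In this sketch, anything still sitting in a node's `outbox` when that node dies is lost, which is the "some recent changes" I am willing to give up.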
Lately I've been thinking about what to do when a node in the cluster goes down, that is, what failover strategy to use. Here are some questions; I hope you can point me to the available alternatives for at least some of them.
- What algorithms are there for doing failover in a distributed system?
- What algorithms are there for consensus in a distributed system?
- How should the nodes in the cluster determine that a node is down?
- How should the nodes determine which database entries had their master copy on the failed node at the time of failure, so that other nodes can recover those entries?
- How should the nodes decide which node(s) have the latest secondary copy of an entry?
- How should the nodes decide which node's secondary copy is promoted to be the new master copy?
- What should happen if a node that was thought to be down suddenly comes back as if nothing had happened?
- How can split-brain scenarios be avoided, where the network is temporarily split in two and each side thinks the other side has died?