I understand that a master/slave setup is redundant, in that data is mirrored to all slaves from a central master. How does this differ from a distributed architecture?
views: 53
answers: 2
A master/slave relationship implies either a backup solution or a failover solution. When the master becomes unavailable, the slave takes over and functions as the new master until the master comes back up.
In a distributed architecture, the servers are basically equals. Any request can be served by any server, so long as the request is atomic.
A master/slave relationship in the context of databases means that all slaves replicate data from the master... However, in the end, every server does the same number of writes (the master receives writes from the application, and the slaves receive the same writes from the master), so replication spreads read load but not write load.
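To make the "equal number of writes" point concrete, here is a minimal in-memory sketch. The `Server` and `Master` classes are made up for illustration and are not any real database's API; real replication is usually asynchronous and log-based, which this ignores.

```python
class Server:
    """Hypothetical stand-in for one database server."""

    def __init__(self, name):
        self.name = name
        self.data = {}
        self.writes = 0  # how many writes this server has applied

    def apply(self, key, value):
        self.data[key] = value
        self.writes += 1


class Master(Server):
    """Takes writes from the application and forwards each one to every slave."""

    def __init__(self, name, slaves):
        super().__init__(name)
        self.slaves = slaves

    def write(self, key, value):
        self.apply(key, value)          # the master applies the write itself
        for slave in self.slaves:
            slave.apply(key, value)     # each slave applies the same write


slaves = [Server("slave1"), Server("slave2")]
master = Master("master", slaves)
for i in range(3):
    master.write(f"bookmark{i}", f"http://example.com/{i}")
```

After the three writes, `master.writes` and each slave's `writes` are all 3, and every server holds a full copy of the data.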
In a distributed system that implements horizontal scaling, you have multiple servers containing the same table schema, but each responsible for a portion of the overall data... No one machine needs to contain all the data.
For example, let's say you are storing user bookmarks. You could store each user's list in one table in a replicated setup, and every machine would receive all the data. Or you could store the data for users with uid % 100 < 50 on server1 and everyone else's on server2. As long as you don't need to run analytical queries over the full userbase, you're fine! Of course, you still need backups for each half, since server1 won't have server2's data.
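A rough sketch of that split, assuming two in-memory dicts stand in for server1 and server2. The function names (`shard_for`, `add_bookmark`, and so on) are invented for this example; only the uid % 100 < 50 rule comes from the answer above.

```python
# In-memory stand-ins for the two database servers (an assumption for this sketch).
SHARDS = {"server1": {}, "server2": {}}


def shard_for(uid: int) -> str:
    """Return the name of the server that owns this user's bookmarks."""
    return "server1" if uid % 100 < 50 else "server2"


def add_bookmark(uid: int, url: str) -> None:
    # Only the owning shard receives the write.
    SHARDS[shard_for(uid)].setdefault(uid, []).append(url)


def get_bookmarks(uid: int) -> list:
    # A per-user lookup touches exactly one shard.
    return SHARDS[shard_for(uid)].get(uid, [])


def count_all_bookmarks() -> int:
    # An analytical query over the full userbase has to visit every shard.
    return sum(len(urls) for shard in SHARDS.values() for urls in shard.values())


add_bookmark(7, "http://example.com/a")     # 7 % 100 = 7    -> server1
add_bookmark(150, "http://example.com/b")   # 150 % 100 = 50 -> server2
add_bookmark(199, "http://example.com/c")   # 199 % 100 = 99 -> server2
```

Per-user reads and writes stay on one machine, which is what lets writes scale. The full count is the kind of query that gets awkward, because it has to fan out to both servers.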