I'm looking to build a scalable database solution for the back end of my website. I've been reading about database design lately, and I've come up with an idea on my own that might work. I think this is a novel way of maintaining n databases with synchronized data, but I could be wrong. So I'm asking SO to evaluate the idea and tell me whether it's crazy or not (or whether it already exists and is implemented).
In this scheme there is a group of server nodes. One node runs a query load balancer (let's call it A), and the rest each run a typical DBMS (let's call those nodes N collectively).
Each node in N is disconnected from the others, i.e., a node in N never needs to communicate with any of the others. Each N has a connection to A only.
The process works like this:
- All database queries are passed through A. (Let's assume for now that A has infinite throughput and processing ability.)
- A inspects each query (Q) and determines whether it is an operation that reads from the database or one that writes to it. (In SQL, a read would be a SELECT; a write would be an INSERT, UPDATE, or DELETE.)
- If Q is a read operation, forward it to one of the nodes in N.
- If Q is a write operation, forward it to all of the nodes in N.
Assuming it's implemented properly, this results in all of the nodes in N having synchronized database content, while queries that only read data need to be sent to just one node, so read load is spread across all of them.
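To make the idea concrete, here's a minimal sketch of what A's routing logic might look like. This is purely illustrative: the `QueryBalancer` class and its method names are made up for this example, in-memory SQLite connections stand in for the real DBMS nodes, and the read/write check is a naive prefix test rather than a real SQL parser.

```python
import sqlite3

class QueryBalancer:
    """Hypothetical sketch of node A: route reads to one node, writes to all."""

    def __init__(self, nodes):
        self.nodes = nodes   # one DB-API connection per node in N
        self.next_read = 0   # round-robin cursor for read queries

    def route(self, query, params=()):
        if query.lstrip().upper().startswith("SELECT"):
            # Read: send to exactly one node, chosen round-robin.
            node = self.nodes[self.next_read % len(self.nodes)]
            self.next_read += 1
            return node.execute(query, params).fetchall()
        # Write: send to every node so all copies stay synchronized.
        for node in self.nodes:
            node.execute(query, params)
            node.commit()

# Usage, with three in-memory SQLite databases playing the role of N:
nodes = [sqlite3.connect(":memory:") for _ in range(3)]
balancer = QueryBalancer(nodes)
balancer.route("CREATE TABLE users (name TEXT)")       # write: goes to all nodes
balancer.route("INSERT INTO users VALUES (?)", ("alice",))
print(balancer.route("SELECT name FROM users"))        # read: goes to one node
```

Note that a real implementation would also have to deal with partial write failures (one node accepting a write while another rejects it), which is exactly the kind of issue that makes me unsure whether the scheme is sound.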
This idea seems to work especially well for me because my system has very few write operations: less than 1% of all queries.
So, a few questions on this idea:
- Does a scheme like this make sense from a theoretical point of view?
- If it does make sense, is there an existing implementation, either commercial or free?